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The  problem  of  computing  the  orbit  of  a  visual  binary 
from  a  set  of  observed  positions  is  reconsidered.   It  is  a 
least  squares  adjustment  problem,  if  the  observational 
errors  follow  a  bias-free  multivariate  Gaussian  distribution 
and  the  covariance  matrix  of  the  observations  is  assumed  to 
be  known. 

The  condition  equations  are  constructed  to  satisfy  both 
the  conic  section  equation  and  the  area  theorem,  which  are 
nonlinear  in  both  the  observations  and  the  adjustment 
parameters.   The  traditional  least  squares  algorithm,  which 
employs  condition  equations  that  are  solved  with  respect  to 
the  uncorrelated  observations  and  either  linear  in  the 
adjustment  parameters  or  linearized  by  developing  them  in 
Taylor  series  by  first-order  approximation,  is  inadequate  in 
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our  orbit  problem.   D.  C.  Brown  proposed  an  algorithm 


solving  a  more  general  least  squares  adjustment  problem  in 
which  the  scalar  residual  function,  however,  is  still 
constructed  by  first-order  approximation.  Not  long  ago,  a 
completely  general  solution  was  published  by  W.  H.  Jefferys, 
who  proposed  a  rigorous  adjustment  algorithm  for  models  in 
which  the  observations  appear  nonlinearly  in  the  condition 
equations  and  may  be  correlated,  and  in  which  construction 
of  the  normal  equations  and  the  residual  function  involves 
no  approximation.  This  method  was  successfully  applied  in 
our  problem. 

The  normal  equations  were  first  solved  by  Newton's 
scheme.   Practical  examples  show  that  this  converges  fast  if 
the  observational  errors  are  sufficiently  small  and  the 
initial  approximate  solution  is  sufficiently  accurate,  and 
that  it  fails  otherwise.   Newton's  method  was  modified  to 
yield  a  definitive  solution  in  the  case  the  normal  approach 
fails,  by  combination  with  the  method  of  steepest  descent 
and  other  sophisticated  algorithms.   Practical  examples  show 
that  the  modified  Newton  scheme  can  always  lead  to  a  final 
solution. 

The  weighting  of  observations,  the  orthogonal  para- 
meters and  the  "efficiency"  of  a  set  of  adjustment 
parameters  are  also  considered.   The  definition  of 


"efficiency"  is  revised. 
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CHAPTER  I 
INTRODUCTION 

The  problem  of  computing  the  orbit  of  a  visual  binary 
from  a  set  of  observed  positions  is  by  no  means  new. 
A  great  variety  of  methods  has  been  proposed.   As  is  well 
known,  only  a  few  of  these  suffice  to  cover  the  practical 
contingencies,  and  the  majority  fails  to  handle  the  input 
data  efficiently  and  properly. 

For  the  visual  binary  case,  the  determination  of  an 
orbit  normally  requires  a  large  number  of  observations.   All 
measures  of  position  angles  and  separations  are,  as  are  all 
observations,  affected  by  observational  errors.   For  the 
purpose  of  our  work,  these  errors  are  assumed  to  follow  a 
bias-free  multivariate  Gaussian  distribution.   Under  this 
assumption,  orbit-computing  is  a  least  squares  adjustment 
problem,  in  which  the  condition  equations  are  nonlinear  in 
both  the  observations  and  the  adjustment  parameters.   The 
condition  equations  must  incorporate  all  relationships  that 
exist  between  observations  and  the  orbital  parameters, 


*Usually,  the  term  "orbital  elements"  is  used.   We  prefer 
"orbital  parameters"  instead.   Strictly  speaking,  the 
orbital  elements  are  the  constants  of  integration  in  the 
two-body  problem  and,  therefore,  do  not  include  the  masses 
of  the  components . 


that  is,  must  state  both  the  area  theorem,  which  follows 
from  Kepler's  equation,  and  the  condition  that  the  projected 
orbit  is  a  conic  section.   H.  Eichhorn  (1985)  has  suggested 
a  new  form  for  the  construction  of  a  complete  set  of 
condition  equations  for  this  very  problem. 

The  traditional  least  squares  algorithm,  which  is  based 
on  condition  equations  linearized  with  respect  to  the 
observational  errors,  will  not  lead  to  those  orbital 
parameters  which  minimize  the  sum  of  the  squares  of  observa- 


tional  errors,  because  linearization,  in  this  case,  is  too 
crude  an  approximation.   In  some  earlier  papers,  H.  Eichhorn 
and  W.  G.  Clary  (1974)  proposed  a  least  squares  solution 
algorithm,  which  takes  into  account  the  second  (in  addition 
to  the  first)  order  derivatives  in  the  adjustment  residuals 
(observational  errors)  and  the  corrections  to  the  initially 
available  approximation  to  the  adjustment  parameters.   A 
completely  general  solution  was  published  by  W.  H.  Jefferys 
(1980,  1981),  who  proposed  a  rigorous  adjustment  algorithm 
for  models  in  which  the  observations  appear  nonlinearly  in 
the  condition  equations.   In  addition,  there  may  be  non- 
linear constraints**  among  model  parameters,  and  the 
observations  may  be  correlated.   In  practice,  the  method  is 
nearly  as  simple  to  apply  as  the  classical  method  of  least 


**We  use  the  term  "constraints"  for  condition  equations 
which  do  not  contain  any  observations  explicitly. 


squares,  for  it  does  not  require  calculation  of  any  deriva- 


tives of  higher  than  the  first  order. 

In  this  paper,  we  present  an  approach  to  solve  the 


orbit  problem  by  Jefferys'  method,  in  which  both  the  area 
theorem  and  the  conic  section  equation  assume  the  function 
of  the  condition  equations. 


CHAPTER  II 
REVIEW  AND  REMARKS 


Review  of  the  Methods  for  Orbit-Computation 
Every  complete  observation  of  a  double  star  supplies  us 
with  three  data:   the  time  of  observation,  the  position 
angle  of  the  secondary  with  respect  to  the  primary,  and  the 
angular  distance  (separation)  between  the  two  stars.   The 
problem  of  computing  the  so-called  orbital  elements  (in  this 
paper,  called  orbital  parameters)  of  a  visual  binary  from  a 
set  of  observations  superficially  appears  analogous  to  the 
case  of  orbits  in  the  planetary  system,  yet  in  practice 
there  is  little  resemblance  between  the  problems.   The 
problem  of  determining  the  orbit  of  a  body  in  the  solar 
system  is  complicated  by  the  motion  of  the  observer  who 
shares  the  motion  of  the  Earth,  so  that,  unlike  in  the  case 
of  a  binary,  the  apparent  path  is  not  merely  a  projection  of 
the  orbit  in  space  onto  the  celestial  sphere. 

In  the  case  of  a  binary,  the  path  observed  is  the 
projection  of  the  motion  of  the  secondary  round  the  primary 
onto  a  plane  perpendicular  to  the  line  of  sight.   The 
apparent  orbit  (i.e.,  the  observed  path  of  the  secondary 
about  the  primary)  is  therefore  not  a  mere  scale  drawing  of 
the  true  orbit  in  space.   The  primary  may  be  situated  at  any 


point  within  the  ellipse  described  by  the  secondary  and,  of 
course,  does  not  necessarily  occupy  either  a  focus  or  the 
center. 

The  problem  of  deriving  "an  orbit"  (meaning  a  set  of 
estimates  of  the  orbital  parameters)  from  the  observations 
was  first  solved  by  F.  Savary  in  1827.   In  1829,  J.  F.  Encke 
quickly  followed  with  a  different  solution  method  which  was 
somewhat  better  adapted  to  what  were  then  the  needs  of  the 
practical  astronomer.   Theoretically,  the  methods  of  Savary 
and  Encke  are  excellent.   But  their  methods  utilize  only 
four  complete  pairs  of  measures  (angle  and  distance)  instead 
of  all  the  available  data  and  closely  emulate  the  treatment 
of  planetary  orbits.   They  are  therefore  inadequate  in  the 
case  of  binary  stars  (W.  D.  Heintz,  1971;  R.  G.  Aitken, 
1935). 

Later,  Sir  John  Herschel  (1832)  communicated  a 
basically  geometric  method  to  the  Royal  Astronomical 
Society.   Herschel 's  method  was  designed  to  utilize  all  the 
available  data,  so  far  as  he  considered  them  reliable. 
Since  then,  the  contributions  to  the  subject  have  been  many. 
Some  consist  of  entirely  new  methods  of  attack,  others  of 
modifications  of  those  already  proposed.   Among  the  more 
notable  investigators  are  Yvon  Villarceau,  H.  H.  Madler,  E. 
F.  W.  Klinkerfues,  T.  N.  Thiele,  M.  Kowalsky,  S.  Glasenapp, 
H.  J.  Zwiers,  K.  Schwarzchild,  T.  J.  J.  See,  H.  N.  Russell, 
R.  T.  A.  Innes,  W.  H.  van  den  Bos  and  others. 


One  may  classify  the  various  methods  published  so  far 
into  "geometric"  methods,  which  are  those  that  enforce  only 
the  constraint  that  the  orbit  is  elliptical,  and  the 
"dynamical"  ones  which  enforce,  in  addition,  the  area 
theorem . 

The  geometric  treatment  initiated  by  J.  Herschel  peaked 
in  Zwiers'  method  (1896)  and  its  modifications,  e.g.  those 
of  H.  M.  Russell  (1898)  and  of  G.  C.  Comstock  (1918).   Every 
geometric  method  has  the  shortcoming  that  it  must  assume  the 
location  of  the  center  of  the  ellipse  to  be  known  while  it 
ignores  the  area  theorem  and  thus  fails  to  enforce  one  of 
the  constraints  to  which  the  observations  are  subject.   The 
growing  guantity  and  guality  of  observations  called  for 
suitable  computing  precepts,  and  the  successful  return  to 
dynamical  methods  began  with  van  den  Bos  (1926). 

Of  the  many  methods  for  orbit-computation  formulated, 
some  are  very  useful  and  applicable  to  a  wide  range  of 
problems,  e.g.  those  by  Zwiers,  Russell  and  those  by  Innes 
and  van  den  Bos. 

Zwiers'  method  (1896)  is  essentially  graphical  and 
assumes  that  the  apparent  orbit  has  been  drawn.   This  method 
is  therefore  useless  unless  the  apparent  ellipse  gives  a 
good  geometrical  representation  of  the  observations  and 
satisfies  the  law  of  areas,  and  thus  will  not  be  further 
described  here  since  we  are  primarily  concerned  with  the 
analytical  methods. 


In  the  following,  we  will  briefly  review  Kowalsky's 
method  and  that  by  Thiele  and  Innes. 
Definition  of  the  Orbital  Parameters 

Seven  parameters  define  the  orbit  and  the  space 
orientation  of  its  plane.   The  first  three  of  these  (P,T,e) 
are  dynamical  and  define  the  motion  in  the  orbit;  the  last 
four  (a,i,fl,w)  are  geometrical  and  give  the  size  and 
orientation  of  the  orbit.   The  parameters  are  defined 
somewhat  differently  from  those  for  the  orbits  of  planets 
and  comets. 

The  first  dynamical  parameter  P  is  the  period  of 
revolution,  usually  in  units  of  mean  sidereal  years;  n  is 
the  mean  (usually  annual)  angular  motion;  since  n=2n/P,  P 
and  n  are  equivalent.   The  second,  T,  is  the  epoch  of 
periastron  passage  (usually  expressed  in  terms  of  years  and 
fractions  thereof).   The  third,  e,  is  the  eccentricity  of 
the  orbital  ellipse. 

The  geometrical  parameter  a  is  the  angle  subtended  by 
the  semi -major  axis  of  the  orbital  ellipse  (usually 
expressed  in  units  of  arcseconds).   The  angle  i  is  the 
inclination  of  the  orbital  plane  to  the  plane  normal  to  the 
line  of  sight,  that  is,  the  angle  between  the  plane  of 
projection  and  that  of  the  orbit  in  space.  It  ranges  from  0C 
to  180°.   When  the  position  angle  increases  with  time,  that 
is,  for  direct  motion,  i  is  between  0°  and  90°;  for  retro- 
grade motion,  i  is  counted  between  90°  and  180°; 
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and  i  is  90°  when  the  orbit  appears  projected  entirely  onto 
the  line  of  nodes.   The  "node",  Q,  is  the  position  angle  of 
the  line  of  intersection  between  the  tangential  plane  of 
projection  and  the  orbital  plane.   There  are  two  nodes  whose 
corresponding  values  of  fi  differ  by  180°.   That  node  in 
which  the  orbital  motion  is  directed  away  from  the  sun  is 
called  the  ascending  node.   We  understand  Q,  which  ranges 
from  0°  to  360°,  to  refer  to  the  ascending  node.   Because  it 
is,  however,  one  of  the  peculiarities  of  the  orbit- 
determination  of  a  visual  binary  that  it  is  in  principle 
impossible — from  positional  data  alone — to  decide  whether 
the  node  is  the  ascending  or  descending  one  fi  may  be 
restricted  to  0°<Q<180°.   The  last,  u,  is  the  longitude  of 
the  periastron  in  the  plane  of  the  orbit,  counted  positive 
in  the  direction  of  the  orbital  motion  and  starting  at  the 
ascending  node.   It  ranges  from  0°  to  360°. 

These  definitions  are  adhered  to  throughout  our  work. 
Some  of  them  may  somewhat  differ  a  little  from  those  given 
by  previous  authors.   But  any  way  in  which  one  defines  them 
does  not  affect  the  principles  of  the  method  we  describe. 
The  Method  of  M.  Kowalsky  and  S.  Glasenapp 

This  old  method  was  first  introduced  by  J.  Herschel  in 
a  rather  cumbersome  form  and  is  better  known  in  its  more 
direct  formulation  by  M.  Kowalsky  in  1873  and  by 
S.  Glasenapp  in  1889.   R.  G.  Aitken  (1935)  gives  the 


detailed  derivation  of  the  formulae  in  his  textbook  The 
Binary  Stars. 

Kowalsky's  method  is  essentially  analytical.   It 
derives  the  orbit  parameters  from  the  coefficients  of  the 
general  equation  of  the  ellipse  which  is  the  orthogonal 
projection  of  the  orbit  in  space,  the  origin  of  coordinates 
being  taken  at  the  primary.   The  projected  orbit  can  be 
expressed  by  a  quadratic  in  x  and  y,  whose  five  coefficients 
are  related  to  those  five  orbital  parameters  which  do  not 
involve  time. 

In  rectangular  coordinates,  the  equation  of  an  ellipse, 
and  thus  of  a  conic  section,  takes  the  form 

cxx2  +  2c2xy  +  c3y2  +  2c4x  +  2c5y  +1=0   ,       (2-1) 

where  the  rectangular  coordinates  (x,y)  are  related  to  the 
more  commonly  directly  observed  polar  coordinates  (p,6)  by 
the  equations 

x  =  pcose   ,  (2-2a) 

y  =  psine   ,  (2-2b) 

where  p  is  the  measured  angular  distance  and  6  the  position 
angle . 
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The  five  coefficients  of  equation  (2-1)  can  be  deter- 
mined by  selecting  five  points  on  the  ellipse  or  by  all  the 
available  observations  in  the  sense  of  least  squares. 

There  is  an  unambiguous  relationship  between  the  five 
coefficients  (^,02,03,04,05)  and  the  five  orbital  para- 
meters (a,e,i,u,Q).   We  can  find  the  detailed  derivation  of 
the  formulae  in  Aitken's  The  Binary  Stars.   Here,  we  will 
therefore  state  only  the  final  formulae  without  derivation. 

The  five  orbital  parameters  (a,e,i,u>,n)  can  be  calcu- 
lated from  the  known  coefficients  (^,02,03,04,05)  by  the 
following  procedure. 
1)  The  parameter  Q  can  be  found  from  the  equation 

2(c2-c4c5) 
tan2Q  =  .  (2-3) 

(C52_C42+Crc3) 

To  determine  in  which  quadrant  Q   is  located,  we  can  use 
two  other  equations: 

c'sin2Q  =  2(C2-c4c5)   ,  (2-3'a) 

c'cos2Q  =  c52-C42+c1-c3   ,  (2-3'b) 


where 


4.   2. 
tan  1 


P2 


(2-3'c) 


which  is  always  positive. 
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More  elegantly,  we  write  in  Eichhorn's  notation, 

2Q  =   plg[2(c2-c4C5),c52-C42+c1-c3]   .       (2-3'd) 

H.  Eichhorn  (1985),  in  his  "Kinematic  Astronomy" 
(unpublished  lecture  notes),  defines  the  plg(x,y)  function 
as  follows: 

plg(x,y)  =  arctan(x/y)  +  90° [ 2-sgnx( 1+sgny) ]   , 

where  arctan  is  the  principal  value  of  the  arctangent. 
2)  The  inclination  i  is  found  from 


2    2 

2cosft(c5  +c4  -c1-c3) 

tan2i  =  -2  + (2-4) 

2,   2    _  s  ,22 


cos 


2Q(Cj  +c4  -c1-c3)-(c5  -c4  +c1-c3) 


Whether  the  i  is  in  the  first  or  second  quadrant  is  deter- 
mined by  whether  the  motion  is  direct  or  retrograde.   If  the 
motion  is  direct,  the  position  angle  increases  with  time, 
i<90°;  otherwise,  i>90°. 

3)  The  equation 

u  =  plg[l-(c4sinQ-C5COsQ)cosi,C4COsfi+C5sinQ)       (2-5) 

gives  the  value  of  w. 

4)  With  i,u),n  known,  two  more  parameters  (a  and  e)  can  be 
calculated  from 


2cos2fl ( c . sinQ-cccosQ ) cosi 
4       D 


e  = 


12 


(2-6) 


2    2  2    2 

[cos2Q(c5  +c-  -c--c3)-(c5  -c4  +c1c3)]sinu>   , 


2cos2Q 
a  = (2-7) 

[cos2Q(c52+c42cl-c3)-(c52-c42+cl-c3)](l-e2)   . 


5)  To  complete  the  solution  analytically,  the  mean  motion  n 
(or  the  period  P)  and  the  time  of  periastron  passage  T,  must 
be  found  from  the  mean  anomaly  M,  computed  from  the  observa- 
tions by  Kepler's  equation: 

M  =  n(t-T)  =  E-esinE   ,  (2-8) 

where  E  is  the  eccentric  anomaly.   Every  M  will  give  an 
equation  of  the  form 

M  =  nt  +  T   ,  (2-9) 

where  r=-nT. 

From  these  equations  the  values  of  n  and  T  are  computed 
by  the  method  of  least  squares. 

This  is  essentially  the  so-called  Kowalsky  method.   It 
looks  mathematically  elegant.   But  because  it  is  not  only 
severely  affected  by  uncertainties  of  observations  but  also 
ignores  the  area  theorem,  it  has  a  very  poor  reputation 
among  seasoned  practitioners.   However,  in  our  work,  we  use 
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it  for  getting  the  initial  approximation  to  the  solution. 
It  serves  this  purpose  very  well. 

The  Method  of  T.  N.  Thiele,  R.  T.  A.  Innes  and 
W.  H.  van  den  Bos 

T.  N.  Thiele  (1883)  published  a  method  of  orbit 
computation  depending  upon  three  observed  positions  and  the 
constant  of  areal  velocity. 

The  radii  vectors  to  two  positions  in  an  orbit  subtend 
an  elliptical  sector  and  a  triangle,  the  sector  being 
related  to  the  time  interval  through  the  law  of  areas. 
Gauss  introduced  the  use  of  the  ratio:   "sector  to  triangle" 
between  different  positions  into  the  orbit  computation  of 
planets,  and  Thiele  applied  the  idea  to  binary  stars. 
Although  the  method  could  have  been  applied  in  a  wide  range 
of  circumstances,  it  became  widely  used  only  after  Innes  and 
van  den  Bos  revived  it. 

In  1926,  R.  T.  A.  Innes  (Aitken,  1935),  seeking  a 
method  simpler  than  those  in  common  use  for  correcting  the 
preliminary  parameters  of  an  orbit  differentially,  indepen- 
dently developed  a  method  of  orbit  computation  which  differs 
from  Thiele' s  in  that  he  used  rectangular  instead  of  polar 
coordinates.   W.  H.  van  den  Bos's  (1926,  1932)  merit  is  not 
merely  a  modification  of  the  method  (transcribing  it  for  the 
use  with  Innes  constants)  but  chiefly  in  its  pioneeringly 
successful  application.   The  device  became  most  widely 
applied.   Briefly,  the  method  of  computation  is  as  follows. 
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The  computation  utilizes  three  positions  (p^,6-j_)  or  the 
corresponding  rectangular  coordinates  (x^,yj_)  at  the  times 
t^  (i=l,2,3).   The  area  constant  C  is  the  seventh  guantity 
reguired.   Thiele  employs  numerical  integration  to  find  the 
value  of  C. 

First,  from  the  observed  data,  find  the  guantities  L  by 

L12  =  t2"tl"D12/c  i  (2-10a) 

L23  =  t3_t2"D23/c   '  (2-10b) 

L13  =  t3-t1-D13/C   ,  (2-10c) 

where  Dj_j  =  pj_pjsin(  6j-6j_) ,  are  the  areas  of  the  corre- 
sponding triangles. 

Then,  from  the  eguations 

nL^2  =  p-sinp   ,  (2-lla) 

nL23  =  g-sing   ,  (2-llb) 

nL13  =  (p+g)-sin(p+g)   ,  (2-llc) 

the  guantities  n,  p  and  g  can  be  found  by  trials,  for  in  the 
three  eguations  above  there  are  only  three  unknowns,  i.e., 
n,  p  and  g.   The  eccentric  anomaly  E2  and  the  eccentricity  e 
can  thus  be  computed  from 

(D—sinp-D..  -,sing) 

esinE,,  =  (2-12a) 

(D23+D12-D13) 
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(D_-.cosp+D-  -Cosq-D-  ^ ) 

ecosE-  =  (2-12b) 

(D23+D12-D13) 


After  E2  and  e  are  obtained,  the  two  other  eccentric 
anomalies  E]_  and  E3  can  be  found  from 

E-l  =  E2  -  P   ,  (2-13a) 

E3  =  E2  +  q   •  (2-13b) 

The  Ej_  are  used  first  to  compute  the  mean  anomalies  Mj_  from 

equations  (2-8),  which  lead  to  three  identical  results  for 

T  as  a  check,  and  second  to  compute  the  coordinates  Xj_  and 
Yj_  from 

Xj_  =  cos  Ej_  -  e   ,  (2-14a) 

Yi  =   sinEiN/(l-e2)   .  (2-14b) 

By  writinq 

xi  =  KX-i   +  FYi   '  (2-15a) 

Yi  =  BXi  +  GYj_   ,  (2-15b) 

the  constants  A,  B,  F,  G  are  obtained  from  two  positions, 
the  third  aqain  servinq  as  a  check.   These  four  coeffi- 
cients, A,  B,  F,  G,  are  the  now  so-called  Thiele-Innes 
constants.   In  addition  to  n,  T,  e,  which  have  already  been 
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determined,  the  other  four  orbital  parameters  a,  i,  Q,    w  can 
be  calculated  from  A,  B,  F,  G  through 

Q+o)  =  plg(B-F,A+G)   ,  (2-16a) 

Q-Ui   =   plg(B+F,A-G)   ,  (2-16b) 


-i    (A-G)cos(w+oj      (B+F)sin(u+Q) 

tan  -  =  = (2-16c) 

2    (A+G)cos(a)-oj      (B-F)sin(w-Q)   , 
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A+G  =  2acos(io+Q)cos  -  (2-16d) 


In  addition  to  the  brief  introduction  to  this  method  given 
above,  a  detailed  description  of  it  can  be  found  in  many 
books,  e.g.,  Aitken's  book  The  Binary  Stars  (193  5)  and  W.  D. 
Heintz's  book  Double  Stars  (1971). 

A  more  accurate  solution  is  obtained  by  correcting 
differentially  the  preliminary  orbit  which  was  somehow 
obtained  by  using  whatever  method.   This  correction  can  be 
achieved  by  a  least  squares  solution. 

The  Method  of  Least  Squares 
The  method  of  least  squares  was  invented  by  Gauss  and 
first  used  by  him  to  calculate  orbits  of  solar  system  bodies 
from  overdetermined  system  of  equations.   It  is  the  most 
important  tool  for  the  reduction  and  adjustment  of  observa- 
tions in  all  fields,  not  only  in  astronomy.   However,  the 
traditional  standard  algorithm,  which  employs  condition 
equations  that  are  solved  with  respect  to  the  (uncorrelated) 
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observations  and  either  linear  in  the  adjustment  parameters 
or  linearized  by  developing  them  in  Taylor  series  which  are 
broken  off  after  the  first  order  terms,  is  inadequate  for 
treating  the  problem  at  hand.  An  algorithm  for  finding  the 
solution  of  a  more  general  least  squares  adjustment  problem 
was  given  by  D.  C.  Brown  (1955).  This  situation  may  briefly 
be  described  as  follows. 

Let  {x}  be  a  set  of  quantities  for  which  an  approxima- 
tion set  {xq}  was  obtained  by  direct  observation.   By 
ordering  the  elements  of  the  sets  {x}  and  {xQ}  and  regarding 
them  as  vectors,  x  and  xq,  respectively,  the  vector  v=x-xq 
is  the  vector  of  the  observational  errors  which  are 
initially  unknown.   Assume  that  they  follow  a  multivariate 
normal  distribution  and  that  their  covariance  matrix  a  is 
regarded  as  known.   Further  assume  that  a  set  of  parameters 
{a}  is  ordered  to  form  the  vector  a.   The  solution  of  the 
least  squares  problem  (or  the  adjustment)  consists  in 
finding  those  values  of  the  elements  of  {x}  and  {a}  which 
minimize  the  quadratic  form  vTa"-'-v  while  at  the  same  time 
rigorously  satisfying  the  condition  equations 

fi({x}i,{a}i)=0    .  (2-17) 

This  is  a  problem  of  finding  a  minimum  of  the  function 
vTa~lv  subject  to  condition  equations.   A  general  rigorous 
and  noniterative  algorithm  for  the  solution  exists  only  for 
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the  case  that  the  elements  of  {x}  and  {a}  occur  linearly  in 
the  functions  fj_.  When  f^  are  nonlinear  in  the  elements  of 
either  {x}  or  {a},  or  both,  equations  which  are  practically 
equivalent  to  the  f  j_  and  which  are  linear  in  the  pertinent 
variables  can  be  derived  in  the  following  way. 

Define  a  vector  6  by  a=aQ+6.   The  condition  equations 
can  then  be  written 


f(x0+v,  a0+6)  =  0 


(2-18) 


where  the  vector  of  functions  f=(f-L,f2  •••,  fm)T'  m  being 
the  number  of  equations  for  which  observations  are 
available.   Now  assume  that  all  elements  of  {v}  and  {6}  are 
sufficiently  small  so  that  the  condition  equations  can  be 
developed  as  a  Taylor  series  and  written 


f0  +  fxv  +  fa6  +0(2) 


=  0   , 


(2-18'a) 


where  fQ  =  f(x0,aQ),  fx  = 


rdf\ 

'df\ 

,3xJ 

and  f     = 
a 

x0'a0 

,3aJ 

xo,ao 

If  the  small  quantities  of  order  higher  than  1  can  be 
neglected,  we  can  write  the  linearized  condition  equations 
as 


f 0  +  fxv  +  fa6  =  0   . 


(2-18'b) 
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These  are  linear  in  the  relevant  variables,  which  are  the 
components  of  v  and  of  6. 

In  order  to  satisfy  the  conditions  (2-18'b),  we  define 
a  vector  -2y.  of  Lagrangian  multipliers  and  minimize  the 
scalar  function 

S(V,6)  =  vTo_1v  -  2u(f0  +  fxv  +  fa6)  (2-19) 

in  which  the  components  of  v  and  6  are  the  variables. 
Setting  the  derivatives  (8S/8v)  and  (3S/86)  equal  to  zero 
and  considering  equations  (2-18'b),  we  obtain 

6  =  -  [faT(fxafxT)fa^"lfaT(fxCTfxT)"lfO   '       (2-20a) 

U  =  -  (fxafxT)-1(fa6+f0)   ,  (2-20b) 

v  =  afxTy.   ,  (2-20c) 

where  we  have  assumed  that  fxafxT  is  nonsingular.   This  is 
the  case  only  if  all  equations  fi=0  contain  at  least  one 
component  of  x;  i.e.,  if  there  are  no  pure  constraints  of 
the  form  g^(a)=0.   This  case  (which  we  shall  not  encounter 
in  our  investigations)  is  discussed  below  in  the  description 
of  Jefferys'  method. 

Note  that  in  constructing  the  scalar  function  S  in 
expression  (2-19),  first  order  approximations  have 
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been  used.   In  some  cases,  the  linearized  representation  of 
Eq.  (2-18)  by  Eq.  (2-18'b)  is  not  accurate  enough.   In  some 
of  these  cases,  the  convergence  toward  the  definitive 
solution  may  be  accelerated  and  sometimes  be  brought  about 
by  a  method  suggested  by  Eichhorn  and  Clary  (1974)  when  a 
strictly  linear  approach  would  be  divergent.   Their  solution 
algorithm  takes  into  account  the  second  (as  well  as  the 
first)  order  derivatives  in  the  adjustment  residuals 
(observational  errors)  and  the  corrections  to  the  initially 
available  approximations  to  the  adjustment  parameters.   They 
pointed  out  that  the  inclusion  of  second  order  terms  in  the 
adjustment  residuals  is  necessary  whenever  the  adjustment 
residuals  themselves  cannot  be  regarded  as  negligible  as 
compared  to  the  adjustment  parameters,  in  which  cases  the 
conventional  solution  techniques  would  not  lead  to  the 
"best"  approximations  for  the  adjustment  parameters  in  the 
sense  of  least  squares.   The  authors  modified  the  condition 
equations  as 


MS 

fvv    fa6  +  Vv  +  D6  +  or  =  0   .  (2-18") 

Nv 


Correspondingly,  the  scalar  function  to  be  minimized  becomes 


N6 

S"(v,6)  =  vTa_1v  -  2u(f0+fxv+fa6+Vv+D6+or)   .   (2-19") 

Nv 
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Here,  the  i-th  line  of  the  matrix  D  is  -a   E. ,  and 

2   x 


E.  = 

1 


3  fi 


(2-21) 


x0,a0 


the  matrix  of  the  Hessean  determinant  of  f  j_  with  respect  to 
the  adjustment  parameters.   Similarly,  the  i-th  line  of 


vector  V  is  — vTW^,  and 
2 


r   -2 


w.  = 

1 


s  £i  1 

3Xj3xky 


x0 ' a0   ; 


(2-22) 


the  i-th  line  of  M  is  vTHj_,  and 


H.  = 

l 


8zfi 

,  oX  ■  o 3.t  . 


x0'a0 


(2-23) 


and  that  of  N  is  6TH^,  so  evidently  M6=Nv. 
Minimizing  S",  also  6,  v  can  be  calculated. 

For  this  algorithm  in  detail,  one  can  refer  to  the 
original  papers  of  Eichhorn  and  Clary.   The  notation  used 
here  is  slightly  different  from  the  original  one  used  by  the 
authors . 

Jefferys  (1980,  1981)  proposed  an  even  more  accurate 
algorithm  which  also  improves  the  convergence  of  the 
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conventional  least  squares  method.   Furthermore,  his  method 
is  nearly  as  simple  to  apply  in  practice  as  the  classical 
method  of  least  squares,  because  it  does  not  require  any 
second  order  derivatives.   Jefferys  defines  the  scalar 
function  to  be  minimized  as 


S  =  -vTa  1v  +  fT(x,a)il  +  gT(a)  y   ,  (2-24) 


where  x=Xq+v;  x,£  are  the  current  "best"  approximations  to  x 
and  a;  and  g  is  another  vector  function  consisting  of  the 
constraints  on  the  parameters;  p.,  y  are  vectors  of 
Langrangian  multipliers  and  evaluated  at  (x,  a).   Minimizing 
S  with  respect  to  x  and  a,  he  arrives  at  the  normal  equa- 
tions 

a_1v  +  fxT(x,a)ii  =  0  ,                       (2-25a) 

fxT(x,a)u  +  gaT(a)y  =  0   ,                    (2-25b) 

f(x,a)  =  0   ,  (2-25c) 

g(a)  =  0   ,  (2-25d) 

where 


*x  ::  ^xL  .   ^a  ::  ^a. 


These  equations  are  exact  and  involve  no  approximations. 
This  is  the  significant  difference  between  Jefferys'  method 
and  those  of  Brown  and  of  Eichhorn  and  Clary. 
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Remarks 

As  mentioned  before,  Kowalsky's  method  will  most  likely 
not  produce  the  best  obtainable  results,  because  the 
relative  observed  coordinates  (x,  y,  t)  are  subjected  only 
to  the  condition  (2-1),  which  involves  only  five  of  the 
seven  necessary  orbital  parameters  as  adjustment  parameters 
and  does  not  enforce  the  area  theorem.   It  can  therefore 
never  be  used  for  a  definitive  orbit  determination  since  it 
completely  ignores  the  observation  epochs. 

Yet,  Eg.  (2-1)  has  the  advantage  that  it  appears  to  be 
simple,  in  particular  it  is  linear  in  the  adjustment 
parameters  albeit  not  in  the  observations.   When  it  is  used 
for  the  determination  of  orbits,  the  right-hand  sides  of  the 
eguations  which  result  from  inserting  a  pair  (x,  y)  of 
observed  rectangular  coordinates  into  Eg.  (2-1)  are  regarded 
as  errors  with  a  univariate  Gaussian  distribution  (i.e.,  as 
normally  distributed  errors).   One  may  then  perform  a  least 
sguares  adjustment  which  is  linear  in  the  adjustment 
parameters.   As  Eichhorn  (1985)  pointed  out,  this  approach, 
while  it  has  the  advantage  that  approximation  values  for  the 
adjustment  parameters  need  not  be  available  at  the  outset, 
fails  to  take  into  account  two  facts. 

1)  It  is  not  the  right-hand  sides  of  the  condition  eguations 
which  are  to  be  considered  as  normally  distributed  errors, 
but  rather  the  observations  (x,y)  or  (p,6).   The  condition 
eguations  (2-1)  thus  contain  more  than  one  observation  each. 
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Since  the  observations  occur  in  the  condition  equations 
nonlinearly,  the  matrix 


f  = 
x 


(3f(x,an 


9x 


x-x0 , a-aQ 


must  be  found.   This  requires  knowledge  of  approximate 
values  ag  for  a.   Approximate  values  xq  for  x  are 
available--they  are  the  observations  themselves. 
Approximate  values  ag  for  a  may  sometimes  indeed  be  obtained 
in  the  classical  way  by  regarding  the  right-hand  sides  of 
the  condition  equations  as  normally  distributed  errors.   In 
addition,  it  should  also  be  taken  into  account  that  the 
covariance  matrix  a  of  the  observations  is  not  necessarily 
diagonal. 

2)  In  some  cases,  especially  when  the  binary  under  study  is 
very  narrow,  the  errors  of  the  observations  are  not 
negligibly  small  compared  with  the  adjustment  parameters. 
This  requires  either  that  second-order  terms  in  the  observa- 
tional errors  v  be  carried  in  the  equations  or,  as  Jefferys 
has  pointed  out,  that  iterations  be  performed  using  in  the 
evaluation  of  the  matrices  fx  and  fa  not  only  improved 
approximations  for  a  but  also  improved  values  for  the 
observed  quantities  as  they  become  available. 

If  Kowalsky's  methods  were  so  modified,  the  algorithm 
would  yield  better  values  for  the  adjustment  parameters  a 
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than  the  traditional  approach.   Either  way,  one  can  usually 
find  an  initial  approximation  by  Kowalsky's  method. 

With  respect  to  both  the  theoretical  clarity  and  the 
practical  applicability,  as  far  as  it  is  concerned,  the 
Thiele-Innes-van  den  Bos  method  leaves  nothing  to  be 
desired.   However,  the  three  places  selected,  even  when 
smoothed  graphically  or  by  some  computation,  may  not  suffice 
to  describe  the  motion  with  sufficient  accuracy,  so  that 
large  and  systematic  residuals  may  remain,  particularly  near 
periastron.   The  method  is  seriously  inadequate  even  if  one 
of  the  ratios  sector  to  triangle  is  very  close  to  1  and  thus 
strongly  affected  by  the  measurement  errors  or  if  the  area 
constant  C  is  not  initially  known  to  the  required  accuracy. 
The  computation  may  then  produce  an  erroneous  orbit  with 
spuriously  high  eccentricity,  perhaps  a  hyperbolic  one,  or 
no  solution  at  all.   And  obviously,  different  combinations 
of  the  three  positions  selected  from  a  set  of  observations 
will  not  likely  give  the  same  result.   This  method  therefore 
fails  to  use  the  information  contained  in  the  observations 
in  the  best  possible  way. 

In  our  work  we  try  to  present  a  fairly  general  least 
squares  algorithm  to  solve  the  orbit  problem.   We  shall 
adopt  Jefferys'  least  squares  method  as  our  basic  approach. 


CHAPTER  III 
GENERAL  CONSIDERATIONS 


This  chapter  contains  a  general  discussion  of  the  least 
squares  orbit  problem.   We  shall  set  up  condition  equations 
which  simultaneously  satisfy  the  ellipse  equation  and  the 
area  theorem. 

We  have  seen  that  it  is  not  sufficient  to  use  Eq.  (2-1) 
as  the  only  type  of  condition  equation  because  this  would 
ignore  the  observing  epochs,  cf.  last  chapter.   Completely 
appropriate  condition  equations  must  explicitly  contain  the 
complete  set  of  the  seven  independent  orbital  parameters  as 
the  adjustment  parameters.   Also,  to  be  useful  in  practice, 
they  must  impose  both  the  geometric  and  dynamical  condi- 
tions, and  must  lead  to  a  convergent  sequence  of  iterations. 

After  the  condition  equations  are  established,  we 
present  the  general  outline  of  the  algorithm  which  solves 
the  orbit  problem  by  Jefferys'  method  of  least  squares. 

We  also  discuss  some  further  suggestions  for  obtaining 
the  initial  approximate  solution  required  for  the  least 
squares  algorithm. 
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The  Condition  Equations 
Assume  that  a  set  of  observations  {xq}  was  obtained 
consisting  of  complete  data  triples  (t,  p,  6),  which 
measure  the  positions  of  the  fainter  component  (secondary) 
with  respect  to  the  brighter  one  (primary):   the  position 
angle  6  is  counted  counterclockwise  from  North  and  ranges 
from  0°  to  360°;  the  angular  separation  p  (also  called 
distance)  is  usually  given  in  seconds  of  arc,  and  t  is 
the  observing  epoch.   The  conversion  of  (p,  6)  to 
rectangular  coordinates  (x,  y)  in  seconds  of  arc  is  as 
following: 

Declination  difference       ^c"^p  =  x  =  Pcos6/       (3-la) 
Right  ascension  difference    (ac-Op)cos6  =  y  =  psine,  (3-lb) 

where  6C,  ac  are  the  declination  and  right  ascension, 
respectively,  of  the  secondary;  6p,  a—  those  of 
primary. 

Equivalently,  the  observations  can  also  be  regarded 
as  relative  coordinates  (t,  x,  y)  of  the  secondary  with 
respect  to  the  primary. 

It  might  be  worthwhile  to  point  out  that  1)  even  though 
the  formulae  (3-1)  are  approximations  valid  only  for  small 
values  of  p,  they  may  be  regarded  as  practically  rigorous 
for  binaries;  2)  we  are  following  the  custom  in  double  star 
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astronomy  by  having  the  x-axis  along  the  colure*  and  the  y- 
axis  tangential  to  the  parallel  of  declination. 

All  observations  in  {x0>  are  affected  by  observational 
errors.   Let  {x}  be  the  set  of  the  true  values  of  the 
observations,  that  is,  those  values  the  observations  would 
have  had  if  there  were  no  observational  errors.   By  ordering 
the  elements  of  the  sets  {xq}  and  {x}  and  regarding  them  as 
vectors,  Xq  and  x  respectively,  we  have  seen  that  we  may 
write  the  vector  of  observational  errors  as  v  =  x  -  Xq. 
These  errors  are  of  course  unknown,  but  as  mentioned  already 
in  the  last  chapter,  we  assume  that  they  follow  a  multi- 
variate normal  distribution  with  known  covariance  matrix. 
For  visual  binaries,  the  relative  orbit  must  be  an  ellipse 
(strictly  speaking,  a  conic  section)  in  space  as  well  as  in 
projection.   All  pairs  (x,  y)  must  therefore  satisfy  the 
condition  eguations  (2-1): 

C^x2  +  2C2xy  +  C3y2  +  2C4x  +  2C5y  +1=0   , 

which  implicitly  involve  five  of  the  seven  orbital  para- 
meters but  do  not  enforce  the  area  theorem. 

The  well-known  relationships  between  the  five  coeffi- 
cients (Clf  C2,  C3,  C4,  C5)  in  Eg.  (2-1)  and  the  five 


Following  Eichhorn's  terminology  who  uses  the  term  "colure" 
generally  for  any  locus  of  constant  right  ascension. 
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orbital  parameters  (e,  a,  i,  co,  fi)  by  way  of  the  Thiele- 
Innes  constants,  have  been  discussed  in  the  last  chapter. 

Consider  a  right-handed  astrocentric  coordinate  system 
K  whose  X-Y  plane  is  the  true  orbital  plane  such  that  the 
positive  X-axis  points  toward  the  periastron  (of  the 
secondary  with  respect  to  the  primary) .   The  positive  Y-axis 
is  obtained  by  rotating  the  X-axis  by  90°  on  the  Z-axis  in 
the  direction  of  the  orbital  motion.   The  axes  of  a  second 
astrocentric,  right-handed  coordinate  system  k  are  parallel 
to  those  of  the  eguator  system  Q.   The  two  systems  K  and  k 
are  related  by  the  transformation 


xK  =  R3(w)R1(i)R3(fi)Xk 


(3-2) 


From  the  theory  of  the  two-body  problem  we  know  that, 
in  the  system  K,  the  coordinates  of  the  secondary  with 
respect  to  the  primary  are  given  by 


xK  = 


x  ■ 

Y 
Z  J 


-   a 


fcosE  -  e  ^ 

sinEVl-e 
n 


0 


(3-3) 


where  E  is  the  eccentric  anomaly,  which  is  the  solution  of 
Kepler's  equation 


n(t-T)  =  E-esinE 


(3-4) 
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Here,  n  and  T,  the  mean  motion  and  the  periastron  epoch,  are 
the  two  orbital  parameters  not  involved  in  Eq.  (2-1). 
From  Eq.  (3-2)  we  obtain 


XK  = 


(A 

B 

C> 

fX> 

F 

G 

H 

y 

,K 

L 

M, 

s.K 

(3-5) 


where 


Kx  +  Ly 


M 


(3-6) 


and 


ABC' 
F   G   H 
K   L   M 


=  R3(w)R1(i)R3(n: 


(3-7) 


or,  in  detail, 

A  =  cosQcosco  -  sinQsinoocosi 

B  =  sinficosoj  +  cosQsinoocosi 

C  =  sinusini   ; 

F  =   -cosfisinu)   -   sinScosoocosi 

G  =   -sinflsinto  +  cosQcosuicosi 

H  =  coscosini      ; 

K  =  sinQsini      ; 


(3-8a) 
(3-8b) 
(3-8c) 
(3-8d) 
(3-8e) 
(3-8f) 
(3-8g) 
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L  =  -cosQsini   ; 
M  =  cosi   , 


(3-8h) 
(3-8i) 


where  in  this  notation,  the  traditional  Thiele-Innes 
constants  would  be  aA,  aB,  aF  and  aG. 

From  Eq.  (3-7)  and  (3-5)  we  can  get 


X  =  Ax  +  By  +  Cz  = 


Gx  -  Fy 


M 


(3-9a) 


Y  =  Fx  +  Gy  +  Hz  =  - 


Bx  -  Ay 


M 


(3-9b) 


Thus,  we  see  that  X  and  Y  can  be  expressed  in  terms  of  i,oj,J2 
and  the  observations  (x,  y) . 
From  Eq.  (3-3)  we  obtain 


X 

COSE 

sinE 

- 

-  +  e 

a 

Y 

a/l-e2 

(3-10a) 


(3-10a) 


Combining  (3-10)  with  Kepler's  equation  (3-4),  we  get 


X 

—  +  e  =  cos 

a 


eY 


n(t-T)  + 


aVl-e2 


(3-lla) 
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=  sin 


eY 


ajl-e2 

More  succinctly,  we  have 


n(t-T)  + 


ijl^i 


(3-llb) 


U  =  cos[  n(t-T)  +  eV  ]   , 
V  =  sin[  n(t-T)  +  eV  ]   , 


or 


U  =  cos[  n(t-T)  +  eJl-U2    ] 
V  =  sin[  n(t-T)  +  eV  ] 


with 


X 
U  =  -  +  e 


aTl^? 


(3-12a) 
(3-12b) 

(3-12'a) 
(3-12'b) 


(3-12c) 


After  X  and  Y  are  expressed  in  terms  of  i,  «,  Q  and  (x,  y) 
as  in  Eqs.  (3-9),  Egs.  (3-11)  or  (3-12)  involve  exactly  the 
seven  orbital  parameters  (n,  T,  a,  e,  i,  w,  Q)  and  the 
observations  (t,  x,  y) .   The  observing  epoch  t  now  appears 
explicitly,  as  it  must  if  the  area  theorem  is  to  be 
enforced. 

Now,  we  see  that  if  both  equations  (3-11)  are  satis- 
fied, Kepler's  equation  which  enforces  area  theorem  would  be 
satisfied  and  furthermore,  the  ellipse  equation  would  also 
be  automatically  satisfied  as  can  be  seen  if  t  is  eliminated 
from  Eqs.  (3-11)  so  that  these  equations  are  reduced  to  one 
equation  in  X  and  Y.   If  we  select  the  two  equations  (3-11) 
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as  the  condition  equations,  we  need  no  longer  carry 
Eq.  (2-1)  separately.   Of  course,  we  can  use  any  one  of 
Eqs.  (3-11)  as  well  as  Eq.  (2-1)  as  the  condition  equations. 
However,  using  Eq.  (2-1)  is  not  convenient,  for  it  contains 
the  five  coefficients  directly,  but  not  the  five  orbital 
parameters  themselves,  even  though  there  are  unique  rela- 
tionships between  them.   Ideally,  the  condition  equations 
should  have  the  adjustment  parameters  explicitly  as 
variables. 

In  our  work,  we  use  the  Eqs.  (3-11)  as  the  complete  set 
of  condition  equations. 

The  General  Statement  of  the  Least 
Squares  Orbit  Problem 

As  seen  in  the  last  section,  the  vector  of  "true 

observations"  x  (presumably  having  been  adjusted  from  the 

observation  xq  by  the  residuals  v)  and  the  vector  of  "true 

orbital  parameters"  a,  a=(n,  T,  a,  e,  i,  u,  Q)T,  must 

satisfy  the  condition  equations 


X 
f1{xQ+v,a.)    =  —  +   e  -  cos 

3. 


eY 


n(t-T)+ 


ifl-( 


=  0   ,      (3-13a) 


f2(x0+v,a)  = 


/T~2 
aVl-e 


-  sin 


eY 


n(t-T)+ 


S- 


aJl-e 


=    0 


(3-13b) 


where 


with 


and 


r  x  1 

r    A 

B 

C    ^ 

'   x   * 

Y 

= 

F 

G 

H 

Y 

I  o  , 

,    K 

L 

H   , 

,    z    > 

z  =  - 


Kx  +  Ly 


M 
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ABC' 
F   G   H 
K   L   M 


R3(o))R1(i)R3(n) 


In  our  problem,  there  are  no  constraints  between  the 
parameters  which  involve  no  observations  so  that  Jefferys'  g 
function  does  not  occur.   The  problem  can  therefore  be 
stated  as  follows. 

Assume  that  the  residuals  {v}  (regarded  as  vector  v)  of 
a  set  of  observations  {xg}  (regarded  as  vector  xg)  follow  a 
multivariate  normal  distribution,  whose  covariance  matrix  a 
is  regarded  as  known;  we  are  to  find  the  best  approximations 
of  v  (for  v,  the  residuals)  and  a  (for  a,  the  parameters) 
such  that 

f;L(xo+v,a)  =  0 
and 

f2(x0+v,a)  =  0 
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are  both  satisfied  while  at  the  same  time  the  quadratic  form 


s  =  _  vTa_1v  (3-14) 

U   2 


is  minimized. 

Following  the  well-known  procedure  introduced  by 
Lagrange,  the  solution  is  obtained  by  minimizing  the  scalar 
function 


S  =  -  vTa_1v  +  fT(x,a)y.  ,  (3-15) 


where  x  =  xg  +  v,  and  y.  is  the  vector  of  Lagrangian  multi- 
pliers, together  with  satisfying  the  equations  fi=0=f2. 
Denoting  the  matrix  of  partial  derivatives  with  respect  to  a 
variable  by  a  subscript,  this  is  equivalent  to  solving  the 
following  normal  equations: 

a_1v  +  f£T(x,a)ii  =  0  (3-16a) 

faTU  =  0   ,  (3-16b) 

f(x,a)  =  0   .  (3-16c) 

We  have  stated  before  that  these  equations  are  exact  and 
therefore  involve  no  approximations.   Before  Jefferys,  all 
authors  used  first  order  or  second  order  approximations  in 
forming  S  in  equation  (3-15).   This  is  the  significant 


36 

difference  that  distinguishes  Jefferys1  method  from  those 
employed  by  previous  authors. 

It  is  evident  that  the  solution  of  the  equations  (3-16) 
would  solve  the  posed  problem. 

About  the  Initial  Solution 
The  least  squares  algorithm  requires  an  initial 
solution  as  starting  point.   Any  approach  which  leads  to 
approximate  values  of  the  orbital  parameters  serves  this 
purpose,  because  our  algorithm  does  not  require  a  very 
accurate  initial  approximation.   As  long  as  the  initial 
approximation  is  not  too  different  from  the  final  result, 
convergence  can  always  be  achieved.   To  obtain  an  initial 
solution,  the  following  procedures  may  lead  to  an  initial 
solution. 

1)  As  mentioned  in  Chapter  II,  Kowalsky's  method  can  produce 
a  preliminary  solution.   Inserting  the  pairs  (x,  y)  of 
observed  rectangular  coordinates  into  Eq.  (2-1),  we  have  a 
set  of  linear  equations  in  which  the  five  coefficients  (c^, 
C2,  C3,  c4,  C5)  are  the  unknowns.   By  making  a  classical 
least  squares  solution  based  on  these  linear  equations,  the 
five  coefficients  can  be  computed.   An  estimate  of  the  five 
parameters  (a,  e,  i,  u,  o.)  can  be  obtained  in  turn  from  the 
unique  relationships  between  them  and  the  five  coefficients. 
The  remaining  two  parameters  (n,  T)  also  can  then  be 
calculated  from  the  known  quantities  simply  by  classical 
least  squares,  as  described  in  Chapter  II. 
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As  Eichhorn  (1985)  has  pointed  out,  it  is  of  course 
better  to  use  the  modified  Kowalsky  method.   Using  (2-1)  as 
condition  equations  and  the  five  coefficients  as  the 
adjustment  parameters,  one  may  iterate  by  Jefferys' 
algorithm  toward  the  best  fitting  adjustment  parameters  and 
the  best  corrections  to  the  observations,  v,  that  is,  to 
arrive  at  the  values  of  a  and  v  which  minimize  the  scalar 
function  Sg  (see  equation  3-14)  while  simultaneously 
satisfying  the  condition  equations. 

This  method  is  simple  to  apply  in  practice.   But 
unfortunately,  especially  when  the  observations  are  not  very 
precise,  the  five  coefficients  (C]_,  c2,  C3,  c4,  C5)  in  some 
cases  do  not  always  satisfy  the  conditions  for  an  ellipse; 
i.e.,  they  do  not  meet  the  requirements 

C1>0,  C3>0   and   C1C3-C22>0. 

However,  in  these  cases,  it  does  not  mean  that  there  is  no 
elliptic  solution  at  all  and  other  approaches  can  be  tried. 
When  this  happens,  one  may  for  instance  take  the  approach 
outlined  in  Chapter  23  of  Lawson  and  Harsson  (1974). 
2)  By  using  some  selected  points  among  the  observation  data 
instead  of  using  all  the  data  points,  sometimes  a  solution 
can  be  found  by  Kowalsky' s  method.   Such  a  solution  is 
usually  also  good  enough  to  be  a  starting  approximation. 


38 

3)  Or,  carefully  selecting  three  points  among  the  observa- 
tion data,  one  may  use  the  Thiele-Innes-van  den  Bos  method 
to  calculate  an  initial  approximation.   The  method  has  been 
described  in  Chapter  II. 


CHAPTER  IV 
THE  SOLUTION  BY  NEWTON'S  METHOD 

The  Solution  by  Newton's  Method 
In  our  problem,  the  normal  equations  (3-16)  are 
nonlinear.   They  must  be  solved  by  linearization  and 
successive  approximations.   Assume  that  approximate  initial 
estimates  of  the  unknowns  in  the  normal  equations  have 
somehow  been  obtained  (using  whatever  methods).   This 
initial  approximation  may  be  improved  by  Newton's  method, 
which  consists  of  linearizing  the  normal  equations  about  the 
available  solution  by  a  first  order  development  and 
obtaining  an  improved  solution  by  solving  the  linearized 
equations.   This  process  is  iterated  with  the  hope  that 
convergence  to  a  definite  solution  would  eventually  be 
obtained.   This  expectation  is  reasonable  if  the  initial 
approximation  is  sufficiently  close  to  the  final  solution 
and  if  the  observational  errors  are  not  too  large. 

Following  Jefferys'  notation  (which  is  also  the 
notation  we  have  used  in  Chapter  III),  let  the  initial 
approximate  solution  (and  also  the  current  approximate 
solution  during  iteration)  be  given  by  (x,a),  where 
x  =  x0  +  v,  x0  being  the  vector  of  observation,  v  the  vector 
of  observational  errors  (for  which  we  adopt  the  nullvector 
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as  initial  approximation);  a  is  the  initial  approximation  of 
the  vector  of  the  seven  orbital  parameters  (adjustment 
parameters);  also  let  the  corrections  to  both  x  and  a  be 
denoted  by  e  and  6,  respectively. 

The  normal  equations  in  our  problem  now  become 


a_1(v+e)  +  f£Tu  =  0   ,  (4-la) 

fxT£  =  °  r  (4-lb) 

£  +  f^e  +  f a&  =  0   .  (4-lc) 


Here  we  have  ignored  (as  Jefferys  did)  products  of  e  and  6 
with  Lagrangian  Multipliers.   This  does  not  affect  the  final 
result,  as  Jefferys  also  pointed  out.   A  caret  in  equations 
(4-1)  above  means  evaluation  at  current  values  of  x  and  a. 

Similar  to  Jefferys'  procedure,  we  solve  the  equations 
(4-1)  as  follows. 

Solving  Eq.  (4-la)  for  e  we  have 

e  =  -  v  -  of^y.   .  (4-2) 

Substituting  (4-2)  into  (4-lc)  for  e,  Eq.  (4-lc)  becomes 

f  -  f xv  -  fxafxTU  +  faS  =  0  .  (4-3) 
Solving  Eq.  (4-3)  for  ji,  we  obtain 
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U  =  W(f  -  f£V  +  f^S)   ,  (4-4) 

where  the  "weight  matrix"  w  is  given  by 

w  =  (fjof^T)-!   .  (4_5) 

Inserting  this  solution  for  \L   into  Eg.  (4-lfc),  we  have 

faTw(£  -  f^v  +  f^6)  =  0   .  (4-6) 

If  we  now  define 

$  =   £   -   f jjv      ,  (4-7) 

and  rearrange  Eg.  (4-6),  the  eguation  for  6  will  have  the 
form 

(f£Twf£)6  =  -  faTw$   .  (4-8) 

This  set  of  linear  eguations  is  easy  to  solve  for  the 
corrections  S  by  general  methods.   With  this  solution  for  6, 
the  improved  residuals  vn  are  obtained  from  the  eguation 

*h  "  "  afxTw($+fg6)   ,  (4-9) 
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which  follows  from  Egs.  (4-la),  (4-4)  and  (4-7).   We  then 
get  the  new  vectors  of  an  and  xn  as 

an  =  a  +  6   ,  (4-10a) 

xn  =  x0  +  vn  ,  (4-10b) 

which  constitute  the  improved  solution. 

After  each  iteration,  we  check  the  relative  magnitude 
of  each  component  in  v  and  6  against  the  corresponding 
component  in  x  and  a  and  get  the  maximum  value  among  all  of 
these  and  test  if  this  value  is  smaller  than  some  specified 
number,  say  10"8.   If  the  improved  solution  is  still  not 
sufficiently  accurate  (i.e.  if  the  above  found  maximum  value 
is  still  not  smaller  than  the  specified  value),  the  process 
of  iterations  has  to  be  continued  until  convergence  has  been 
attained,  that  is,  until  subseguent  iterations  no  longer 
give  significant  corrections. 

At  the  outset,  the  obvious  starting  point  for  this 
scheme  is  to  adopt  x=x0  as  the  initial  approximation  for  the 
"true  observation"  vector  x  (in  this  case,  v=0),  and  to  use 
as  first  approximation  of  a  for  a  a  vector  a0,  an  initial 
solution  of  the  seven  orbital  parameters,  which  has  been 
obtained  somehow. 

It  is  important  for  convergence  that  the  initial 
solution  of  (x,a)  is  not  too  different  from  the  final 
solution  which  is  obtained  by  the  process  given  by  Eg.  (4-1) 
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through  (4-10).   In  Chapter  III,  we  discussed  how  to  find  a 
good  approximation  as  an  initial  solution. 

According  to  the  scheme  outlined  above,  the  application 
of  Newton's  method  in  our  problem  would  consist  of  the 
following  steps. 
Step  1. 

Calculate  f,  f^  and  fg  from  the  current  values  of  x 
and  a   ; 
Step  2. 

Calculate  $  from  $  =  f  -  f^v  ; 
Step  3. 

Calculate  the  "weight  matrix"  w  from  w  =  (fxO-fxT)_1  '> 
Step  4. 

Solve  the  corrections  to  the  parameters,  6,  from  the 
linear  equations  (fgTwf^)6  =  -  fgTw$   ; 
Step  5. 


Calculate  the  improved  residuals  vn  from 
vn  =  -  afxTw($+fgS)   ; 
Step  6. 

Find  the  new  approximate  solution  from 
an  =  a  +  6  , 

*n  =  x0  +  vn   ; 
Step  7. 

Test  the  relative  magnitude  of  each  component  of  6  and 
v  against  a  and  x,  and  decide  if  a  further  iteration  is 
needed,  in  which  case  all  the  steps  above  must  be  repeated. 
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In  detail,  the  steps  are  as  follows. 
Step  1. 

Calculate  the  vector  of  functions  in  condition  equa- 
tions f,  and  the  vectors  of  partial  derivatives  fx  and  fg 
from  the  current  values  of  x  and  a,  where  x=Xq+v. 

The  dimension  of  the  vectors  xq,  v,  x  all  are  2m,  m 
being  the  number  of  observed  positions. 
The  observation  vector  is 

xo  =  (x10,YiO/-- wXi0,yi0,. . .,xm0,ym0)T  .  (4-11) 
The  current  vector  of  corrections  to  observations  is 

v=  (vxl,vyl,. . .,vxi,vyi, . . . ,vxm,vym)T   .        (4-12) 

From  Xq  and  v,  x  can  be  easily  found  by  x=Xg+v, 

x  =  (x10+vxl,y10+vvl,...,xi0+vxi,yi0+vy,... 

•••'*raO+vxm'YmO+vYm>T  •  (4-13) 

The  current  approximation  for  the  seven  parameters,  a,  is 

a  =  (n,T,a,e,i,w,fi)T   ,  (4-14) 


at  current  values.   Insert  the  known  (x,a)  into  the 
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condition  equations  to  get  the  function  2m-vector  f.   It  has 
a  form 

£=  [fii,f2l»-« •'fii,f2i'---'flm'f2m]T   »   (4_15) 
where 


f   =  —  +  e  -  cosEi    ,  (4-16a) 

a 


f_.  =  -i  -  sinE.    ,  (4-16b) 

21    b 


with 


and 


b  =  ajl^      f  (4-16c) 


eY, 
E.  =  n(t.-T)  +  — -  (4-16d) 

11       b 


The  coordinates  X,Y  are  calculated  from  Eqs.  (3-9), 
A,B,F,G,M  from  Eqs.  (3-8). 

The  first  derivatives  f£,  fg   are  calculated  at  current 
values  of  x  and  a. 

The  partial  derivatives  of  £  with  respect  to  the 
observations  x,  f^,    is  a  block-diagonal  2mx2m  square  matrix 
of  the  form 
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»£u   3fn 


fx  = 


9x, 


3x. 


3Yi 


3f21   9f 


*Yi 


9fli   9fli 


3Xj 


8x. 


3y. 


3f2i   3f2i 


^i 


9flm  8flm 
9xm   9ym 


9f2m   9f2m 


3x 


m 


3y 


m 


i.e. , 


(4-17) 


with 


fx  =  diag(gi,g2,  —  ,gj_,...,gm) 


(4-18) 


3x, 


g<  = 


3f 


2i 


3x, 


9*i 


3f 


2i 


sy. 


i=l  ,m 


(4-19) 


From  Eqs.  (4-16),  (3-8)  and  (3-9),  we  obtain 
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1 
a 


<3i    = 


-sinE. 
b    x 


— (1-ecosE. ) 
b        x 


(    3X. 


3x, 


3Y: 


3x, 


ax,  1 

i 

3y. 


3Y. 


*i 


(4-20) 


In  particular,  we  have 


3X: 


3X 


3xi 

3x 

M 

!!i  = 

3Y 

B 

=  —  = 

-  — 

3x. 

3x 

H 

3X. 

l 

3X      F 

=  —  =  -  _ 

3Yi 

3y     M 

9Yi 

3Y    A 

=  —  =  _ 

3Yi 

3y   M 

(4-21) 


and  therefore 


1 

e 

"  G 

F 

— 

— sinE. 

— 

—  — 

a 

b    x 

M 

M 

gi = 

1 

B 

A 

0 

—  ( 1-ecosE. ) 

-  — 

— 

b        x 

M 

M 

(4-22) 


The  expressions  for  all  the  partial  derivatives  in  f£  are 
listed  in  Table  4-1. 

The  dimension  of  f^,  the  vector  of  the  partial  deriva- 
tives of  f  with  respect  to  the  seven  parameters,  is  2mx7 . 
It  has  the  form 


fa  ~  (G1'G2 ' • • • 'Gi> • • • 'Gm)    » 


(4-23) 
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Table  4-1. 
Expressions  for  All  the  Partial  Derivatives  in  f£ 


:li 


f2i 


3 

1 

fG 

B                    ^ 

— 

—   - 

-  —  esinE. 

3x 

M 

u 

b             lJ 

3 

1 

rF 

A                  ^ 

3y 

M 

u 

b              X) 

1  B 

( 1-ecosE. ) 

Mb        x 


1  A 

—  —  ( 1-ecosE. ) 

Mb        x 


where 
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Gi   = 


•  3fu    3fljL    afu    afu    3£l.     9f]Li    afn  ' 

3n    3T    3a    3e    3i    3w    3Q 


3f2i   3f2.   3f2.   3f2i   3f2i   3f2i   3f2i 


^   3n    3T    3a    3e    3i    3u    3fi  J      . 
The  expressions  for  all  the  elements  in  Gj_  are  listed  below. 


<  9£u    3fu  ^ 


3n 


3n 


3T 


3f2i    3f2i 


3T 


sinE. 


"COSE; 


^ 


(  V*      -n  ] 


(4-24a) 


9fli     Xi    eYi  .  p 
=  -—=■-  sinE. 

3a      a    ab 


(4-24b) 


9£2i 
3a 


— (1-ecosE. ) 
ab        1 


(4-24c) 


3f 


li 


3e 


Y. sinE. 
b(l-e2) 


+  1 


(4-24d) 


3f9.    e  -  cosE. 

2~~  i 


3e 


b(l-e") 


(4-24e) 
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f    3f-  .  3f .  .  3f ,  .  "V 
li    ll    li 

3i    3u    3Q 


3f-.  3f0.  3f-. 
2i  __2i    2i 

3i   3u   30 


r  i 

a 
0 


e 

— sinE, 

b 


"N  (    3X.  3X.  3X.   1 

1  l  1 

3i   3u   3Q 


— ( 1-ecosE. ) 
b        1 


dYL    dY^    dYL 
^    3i   3w   3Q 


(4-24f ) 


From  Egs.  (3-8)  and  (3-9)  we  can  find  the  expressions  for 
the  following  six  partial  derivatives. 


3X 

M  =  z  .  sinco 

3i     X 


M 


3Y, 


3i 


=  Z-COSO)   ; 


3X.  3Y, 

M  — -  =  Ayi  -  Bx±      ,    M  — =  =  Fy^^  -  Gx^ 
3d)  3o) 


3X,  3Y, 

M  — i  =  Gy.  +  Fx.   ,    M  — =  =  -  By.  -  Ax. 
3C2       1      X         3Q         1      1 


(4-24g) 


In  terms  of  these  derivatives,  we  have 


rafn  3fu  3fli^i  r 


3i    3oo    3fi 


3f2i  3f2i  3f2i 
k  3i    3o)    3Q  j 


—  — smE. 
Ma  Mb 

1    e 
0   —  -  — cosE. 
Mb   Mb 


z.sinoj  Ay.-Bx.   Gy.+Fx. 


z.coso)  Fy.-Gx.  -By.-Ax. 


(4-24h) 


All  expressions  in  fz   are  listed  in  Table  4-2 
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Table  4-2. 


Expressions  for  All  the  Partial  Derivatives  in  fs 

a. 


3 

__ 

3n 

3 

3T 

3 

3a 

3 

3e 

*li 


(ti-T)sinE. 


-nsinE, 


Xi       Yi        •    . 
—5-  -  —  esmE. 

bT        ab  1 


•  J-  sinE, 


:2i 


-(ti-T)cosE. 


ncosE. 


Yi 

-  —   (1-ecosE. ) 


ab 


b(l-e') 


b(l-e') 


—sr-   (e-cosE.  ) 


3i 


M 


sinw       e 


+  —  sinE.coso) 

a  b  1 


—  (1-ecosE. )z.cosu 
Mb  1      1 


3        1 
3w     M 


Ayi"Bxi        e 

+  -sinE. (Fy. -Gx. ) 

a  b  1        1        1 


( 1-ecosE . ) ( Fy, -Gx . ) 

Mb  ill 


3 
3Q 

1 
M 

Gyi+Fxi 

a 

e 

-  -sinE. (By. +Ax. ) 
b  1        1        x 


— ( ecosE . -1 ) ( By . +Ax . ) 

Mb  x  *        x 
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Step  2 . 

Calculate  <j>  from  £,  v  and  f£  by  $=f-f£v.   The  dimension 
of  the  vector  $  is  also  2m.   It  has  the  form 


-  \  T 

<j>  =    (<J>i/<J>2>  •  •  • /^i' •  •  ■ '^m'         ' 


(4-25) 


where 


♦i 


li 


3xi 


XI 


3y. 


v 


yi 


Sf2i       3f2i 

:2i  -  r~  vxi  -  z—  V 


3xi 


3Y4 


v 


*i2 


(4-25') 


Step  3 . 

Calculate  the  weight  matrix  w  from  f^  and  a. 

The  matrix  f^  has  been  calculated  in  step  1.   The 
covariance  matrix  a  is  assumed  to  be  known.   The  dimension 
of  a   is  2mx2m. 

An  example  for  computing  a  is  shown  below. 

The  relationship  between  the  covariance  matrix  of 
rectangular  coordinates  (x,  y)  and  that  of  polar  coordinates 
(p,  9)  is 


xy 


3(x,y) 
3(p,6) 


pe 


8(x,y) 

3(p,e) 


-iT 


(4-26) 


where  x=pcos6,  y=psin6  and 
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3(x,y)   f   cose   -psine  ^ 
9(p,6)    ^  sine    pcose  j 


(4-26' ) 


Thus,  we  have 


xy 


r   cose    -psine 
k  sine    pcose   , 


I  o 


0  ^ 


(   cose 


sine  "^ 


k  -psine   pcose  j 


and  therefore 


xy 


2       2    2 

a  cos  6  +  aQp  sin  6 


^  (a  -  aflp  )cos6sine 


(a  -  aQp  )cos6sin6  ^ 

P      a 

2       2    2 

a  sin  6  +  aQp  cos"0  . 
p         0 


(4-26") 


In  these  expressions,  ap=A2p,   aQ=A20   and  Ap,   A0  are  the 
observational  errors  in  p  and  0,  respectively.   For  the 
observations,  the  random  errors  are  of  similar  order  of 
magnitude  as  the  systematic  ones,  larger  in  separation  than 
in  position  angle.   The  average  errors  pA0  and  Ap  vary 
somewhat  with  the  separation  p  and  can  be  assumed,  for  many 
series  of  observations,  to  follow  the  form  Cp1/3,  where  C 
varies  with  different  observers.   For  a  single  good  observa- 
tion C  will  not  exceed  0V03  in  position  angle  (pA0)  and 
0V08  in  separation  (Ap)  (Heintz,  1971).   Errors  will  be 
somewhat  larger  and  difficult  to  measure  if  one  or  both 
components  are  faint.   If  the  errors  are  expressed  in  the 
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dimensionless  (relative)  forms  A9  and  Ap/p,  it  is  seen  that 
they  increase  as  pairs  become  closer. 

Based  on  the  considerations  above,  we  can,  for  example, 
put  Ap=0V08p1/3  and  pA6=0l.,03p1/3 ,  i.e.  A8=0'.'03p"2/3 .   If  we 
allow  each  pair  of  numbers  (x^y-^)  in  an  observation  to  be 
correlated,  but  no  correlations  between  different  observa- 
tions, a,  would  be  block-diagonal.   In  this  case,  we  have 

a  =  diag( a^,a2/ ••• /O^, ... ,am)   , 


where 


ai  = 


axixi   axiyi 


v.  ayixi   CTyiyi 


'  ail    ai3 


ai4   ai2 


with  o"xiyi=ayixi'  i.e.,  ai3-ai4   . 

The  form  of  the  weight  matrix  is  simpler  in  this  case; 
it  is  also  block-diagonal. 

According  to  Eg.  (4-5),  w= ( f xaf J) " 1 .  The  matrices  fx, 
a,  fxT  now  are  all  block-diagonal.   Therefore 


w  =  diag(wlfW2, . . • ,w^, . . . ,wm)   , 


(4-29) 


with 
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wi  = 


wil 
.  wi4 


wi3  > 
wi2  > 


(4-30) 


.-1 


The  computation  of  w  is  straightforward.   We  first  find  w 
If  we  denote  \i=w~^-=f^pf^Ir    it  is  obvious  that  u  has  the  same 
form 


u  =  diag(u1/u2,...,ui,...,um) 


(4-31) 


where 


ui  = 


'  uil    ui3  " 
<  ui4    ui2  j 


3x, 


3f 


2i 


3x. 


»fu1 

r 

3*i 

sf2i 

*i  J 

v. 

'il 


'14 


ai3 


a.  _ 
i2 


"  3fii 

3X; 


df 


li 


3y. 


3fo-  ^ 
2i 

3x. 

l 

3f2i 


3y,- 


y  * 


i.e. 


(4-32) 


u 


il 


raf.-r      3f . .  3f.. 

_J+     a       +  2   — ii— ii  a.,  + 
Ux±  J   1X      3X;L   3Yi 


'12   ' 


(4-33a) 
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u 


i2 


2i 
3x. 


3f-.  3f-. 
-.    2i    2i 

aii  +  2 : —  : —  ai3  + 

dx±      3yi 


2i 


ai2   ;    (4-33b) 


u 


i3 


3f - •  3f-. 
li    2i 

3x.   3x. 


a...  + 
ll 


Chi,.    3f-.    3f.  .  3f-41 
li    2i      li    2i 

^3x.   3y.     3y.   3x. 


ai3  + 


3f-  .  3f«. 
li    2i 

3Yi   3Yi 


ai2   ; 


(4-33c) 


ui4  =  ui3 


(4-33d) 


After  computing  u,  we  can  find  its  inverse  w  very  easily. 
If  we  denote  u=^iiUi2~ui3ui4:=uilui2"ui3  *    we  nave 


w.  = 


wil 


w 


i3 


Wi4 


w 


.12  J 


Ui2 

Ui4 

u 

u 

Ui3 

uil 

u 


u 


(4-34) 


As  seen  above,  we  can  calculate  the  components  of  w  directly 
from  the  components  of  u. 

If  the  quantities  which  constitute  different  observa- 
tions were  also  correlated,  we  would  have  to  find  the 
product  of  the  matrices  f£,  a,  f£T  and  calculate  its  inverse 
to  get  w.   This  would  be  more  complicated  by  far. 
Step  4. 
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Find  f5Twfa  and  -fsT$  from  fs,  $  and  w,  then  solve  the 


•a  Bia 
linear  equations 


(f5Twf5)s  =  f:Tw$ 


to  get  the  correction  vector  6. 

The  dimension  of  the  matrix  faTwfa  ^s  ^x7  an^  ~faTw*'  ® 
are  7-vectors.   Denoting  A=f^Twfg  and  B=fgTw$,  we  have 


^  r  «u  m1±         af21  »fu 

A{j,k)  =  > wi;L  + wi3 


i<L 


v 


9aj   3ak 


3a.   3ak 


3f - ■  3f~.  3f,.  3f-. 

li    2i  „     2i  2i 

+  w.  ~.  +  w.  - 

i3  ^_    ^_     12 


3aj   3ak 


9aj   9ak 


and 


j=l,7; 

k=l,7;   (4-35a) 


m 


(    3f 


B(j)  =  T 


2=1 


li 


3f 


3a. 


Wil  *il  + 


2i 


3a. 


w.  -,  <i>  .  *    + 
i3  yil 


8fli     *     3f2i 
+  ~—  w.3  $i2  + w±2  <j»i2 


3Sj 


3a. 
J 


j=l,7;   (4-35b) 


where 
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!fii„!£ii      3fii  _  3fii      !£ii  =  !£ii      afu  _  3fu 

da-  3n  ,     8a2     3T  ,     3a3    5a   ,     3a4     3e  , 


. . . ,  and  so  forth. 

To  solve  the  linear  equations  A6=-B  for  6,  we  can  apply  any 
linear  equation  solution  method,  e.g.  Gaussian  Elimination, 
Gauss-Jordan  Elimination,  the  Jacobi  Iterative  method,  etc. 
In  a  private  communication,  Eichhorn  (1985)  proposed  a 
special  solution  method  for  this  particular  problem,  because 
the  covariance  matrix,  i.e.,  the  inverse  of  the  coefficient 
matrix  of  6,  is  fortunately  a  positive  definite  matrix.   The 
method  will  be  described  in  another  section  of  this  chapter. 
In  our  work,  we  use  this  special  method  for  solving  the 
linear  equations  above  for  5. 
Step  5. 

Calculate  the  new  vector  vn,  which  now  is  the  improved 
correction  vector  to  the  observations,  from 

^n  =  '  o-fxTw($+fg6)   • 

The  dimension  of  the  vector  v  is  2m.   Denote  Q=$fgS  and 
R=f£TwQ,  Q  and  R  both  are  also  2m-vectors.   We  can  see  that 

Q  =  (Qi,Q2/---/Qi,--.,Qm)T   /  (4-36) 


where 
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Q<  = 


3f .  . 
f    - 
11    3x, ^  Vxi 


3f 


li 


7    3f 


v 


,  +z 


li 


3yi  Y1   a=T  3a. 


3f 


2i 


3f 


v 


'2i    3Xi 


3y 


2i        7    3f2i 
—  v  .  +  >    6  . 

Y1   fo  3a.    3 


i.e. , 


Q;  = 


7    3f 


:ii+l^ 


li 


i^T  3a. 
J 


6. 


7    3f 


f2i+2 


2i 


i=l   3a. 


Oil 


'12 


(4-36' ) 


In  the  case  of  w  being  block-diagonal, 


R  —  ( R^ ,  R2  /  • «  •  /  R^  /  •  •  •  /  R^ )    / 


(4-37) 


where 


Ri  = 


(   Rn  ^ 


I  Ri2 
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3f 


3x. 

l 


U  (2ilWil+2i2Wi3)  +  77^  (QilWi4+2i2Wi2) 


3x. 


U  (2ilWil+2i2Wi3)  +  T^   (2ilWi4+2i2Wi2) 


3y 


Therefore,  in  this  case, 


3yi 


(4-37') 


vn  =  (V1#  V2,  ...,  V±,  ...,  Vm)T   , 


(4-38) 


where 


V.  = 

l 


(   v  . 

XI 


V  . 


(   anRn  +  a.3Ri2 


a. ,R. ,  +  a. -R. _ 
^   i4  ll     i2  x2 


(4-38' ) 


Step  6. 

Find  the  new  approximate  solutions  an  and  xn  from 


an  =  a  +  6 


xn  =  x0  +  vn 


Step  7. 

Determine  if  another  iteration  is  needed. 
Calculate  all  quantities  of 


Xi(new)  ~  xi(old) 


x 


i(old) 


rj(new)  ~  yi(old) 
yi(old) 


and 
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(altogether  2m+7  quantities)  and  pick  up  the  maximum  value 
among  them.   If  this  maximum  exceeds  a  pre-set  specified 
value,  say  10"^ ,  return  to  the  very  beginning  and  take  all 
of  the  steps  once  again.   Only  when  the  maximum  is  less  than 
the  specified  value,  we  assume  that  good  convergence  has 
been  achieved. 

Above  is  the  scheme  of  Newton's  method  in  our  orbit 
problem.   In  real  programming,  some  additional  considera- 
tions have  been  taken  into  account.   For  example,  in  the 
actual  program,  all  variables  are  dimensionless.   For 
rectangular  coordinate  pairs  (x,  y) ,  two  new  quantities  are 
defined, 


x         y 

x'  =  —    y»  =  —  (4-39) 

x0   '       Y0   ' 


where  (xq,  yg)  are  the  "observations"  in  the  initial  data, 
so  that  x  =  xqx' ,  y  =  ygy' .   For  the  seven  parameters,  seven 
new  variables  are  defined  as  well.   They  are 

n  T  a  e  i 


al  =  — 

nQ  , 

a2 

T0  ' 

a3  =  ~ 

a0  ' 

a4 

e0  ' 

a5 

L0    ' 

0) 

Q 

* 

a6  =  — 

w0  ' 

a7 

fi0 

i 

(4-3 

where  (no,TQ,aQ,eQ,io,u)g  ,Qq  )  in  ag  is  the  initial 
approximate  solution  and  e  is  defined  as  e=sinc})  and  eg=sin<})Q 
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In  terms  of  these  new  variables,  all  formulae  for 
computing  the  partial  derivatives  need  only  slight  modifica- 
tions, and  the  whole  process  remains  in  principle  unchanged. 

In  addition,  we  have  seen  that  f£,  a,  f£T  are  sguare 
2mx2m  matrices.   If  the  number  of  observations  is,  e.g. 
m=100,  the  covariance  matrix  a  has  40,000  components,  and 
these  three  matrices  alone  occupy  a  huge  storage,  120,000, 
in  the  computer.   In  practical  programming,  we  should  keep 
the  reguired  computer  storage  to  a  possible  minimum.   A 
block-diagonal  covariance  matrix  a  can  be  stored  in  a  2mx2 
matrix,  and  the  same  applies  to  all  other  related  matrices. 
This  greatly  reduces  the  need  for  storage. 

Weighting  of  Observations 
As  we  know,  the  measures  of  any  binary  star  will  be 
affected  by  the  accidental  and  systematic  observational 
errors  and,  occasionally,  from  blunders,  i.e.,  actual 
mistakes.   The  points,  when  plotted,  will  not  lie  exactly  on 
an  ellipse  but  will  occupy  a  region  which  is  clustered 
around  an  ellipse  only  in  a  general  way. 

Observations  are  freguently  combined  into  normal 
places.   Among  the  observed  guantities,  the  time  is  the  one 
observed  most  precisely.   To  investigate  the  measurements 
for  discordance  before  using  them  to  calculate  an  orbit,  the 
simplest  method  is  to  plot  the  distance  p  in  seconds  of  arc 
and  the  position  angles  6,  separately,  as  ordinates,  against 
the  times  of  observation  as  abscissae.   Smooth  curves  are 
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drawn  to  represent  the  general  run  of  the  measures  and  in 
drawing  these  curves,  more  consideration  will  naturally  be 
given  to  those  observation  points  which  are  relatively  more 
precise  (for  example,  a  point  based  upon  several  well 
agreeing  measures  by  a  skilled  observer  and  supported  by  the 
preceding  and  following  observations)  than  to  the  others. 
The  deviation  of  the  observation  points  is  in  the  ordinate 
direction  only  and  gives  a  good  idea  of  the  accuracy  of  both 
observed  guantities.   The  curves  will  show  whether  or  not 
the  measures  as  a  whole  are  sufficiently  good  to  warrant  the 
determination  of  a  reasonably  reliable  orbit.   Observations 
which  are  seriously  in  error  will  be  clearly  revealed  and 
should  be  rejected  or  given  very  low  weights.   The  curves 
will  also  give  a  general  idea  of  how  the  observations  are  to 
be  weighted.   The  points  which  show  larger  deviations  should 
be  given  lower  weights  and  the  well-determined  points  should 
be  given  higher  weights. 

It  is  hard  to  recommend  a  general  rule  for  the 
weighting  of  measurements  and  normal  positions.   Some 
precept  could  be  considered  (W.  D.  Heintz,  1971):   compute  a 
weight  p^  according  to  the  number  of  observations,  and  P2 
according  to  the  "weight"  assigned  to  the  observers,  then 


the  normal  place  receives  the  weight  p7piP2  ( if  computations 
are  made  in  the  guantities  de  and  dp/p).  This  precept  could 
avoid  unduly  high  weights  for  single  measurements  as  well  as 
for  very  many  observations  by  few  observers,  and  would 
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reduce  the  influence  of  residual  systematic  errors.   Its 
implicit  assumption  is  a  proportionality  of  the  errors  pde 
and  dp  with  Jp.      This  holds  better  for  the  multi-observer 
average,  as  the  share  by  less  accurate  data  usually 
increases  at  larger  separations. 

In  our  work,  we  given  the  points  on  or  nearly  on  the 
smooth  curves  egual  unit  weights,  and  assign  lower  weights 
to  those  farther  from  the  lines. 

When  we  take  into  account  the  weights  for  the  observa- 
tions, some  slight  and  very  simple  modification  is  needed  in 
the  solution  process  described  above. 

Let  G  be  the  matrix  of  weights  for  the  observational 
errors  v.   The  dimension  of  G  is  2mx2m,  and  G  is  diagonal. 

Taking  into  account  G,  the  residual  function  S0  will  be 
modified  as 


1  i 

-T  T  -1  -      -T   -1  - 

Sn  =  -  v  Ga   Gv  =  -  vGo   Gv   ,  (4-40) 

U    2  2 


because  GT  =  G.   If  we  define  E-1  =  Ga_1G,  then 

1  -T  -1- 
Sn  =  -  v  E   v   ,  (4-40'  ) 

U    2 

which  has  the  exact  form  as  before  except  a  is  replaced  by 
E.   Therefore  we  need  only  to  replace  a  by  E,  a~l  by  E~l 
wherever  a  of  a~l  appears. 
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Because  G  is  diagonal,  the  calculations  of  E  and  E  ^ 
from  G  and  a  are  also  very  simple. 

The  Orthogonal  Set  of  Adjustment  Parameters 
and  the  Efficiency  of  a  Set  of  Orbital  Parameters 

As  mentioned  in  the  first  section  of  this  chapter, 
Eichhorn  (1985)  has  proposed  a  special  method  in  a  private 
communication  for  solving  the  linear  normal  eguations  A6=-B 
of  our  problem.   This  method  is  as  follows. 

According  to  its  definition,  A=f£Twfg  is  obviously  a 
symmetrical  positive  definite  matrix. 

Given  a  symmetrical  positive  definite  matrix  Q,  we  look 
for  its  eigenvectors  X  which  have  the  property  that 

QX  =  XX   .  (4-41) 

Any  value  X  which  satisfies  Eg.  (4-41)  is  an  eigenvalue  of 
Q,  and  since  Q  is  positive  definite  we  know  that  all  X>0. 
Obviously  X  must  satisfy  the  eguation  | Q— IX |  =  0,  the 
"characteristic  eguation"  of  Q.   This  is  a  polynomial  in  X 
of  the  same  order  n  as  that  of  Q.   The  solutions  are 
X]_,  X2,...,Xn.   The  solution  X^  of  the  homogeneous  system 
(4-41)  with  X=XK  is  the  k-th  eigenvector.   Since  this  is 
determined  only  with  the  uncertainty  of  an  arbitrary  scale 
factor,  we  may  always  achieve  |XK|  =  1  for  all  k.   Let  the 
matrix  P=(X1,X2, . . . ,Xn)  be  the  matrix  of  the  normalized 
eigenvectors  of  Q.   It  can  be  shown  that  P  is  therefore 
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orthogonal,  so  that  PT=p~l.   Equation  (4-41)  can  be  written 
as 

QP  =  Pdiag(X1,  ...,  Xn)   .  (4-42) 

Writing  the  diagonal  matrix  diag(X]_,  X2,  •..,  Xn)=D,  we 
rewrite  Eq.  (4-42)  as 

QP  =  PD   , 

whence 

PTQP  =  D   ,      Q  =  PDPT   ,  (4-43) 


and 


PTQ_1P  =  D"1   ,    Q"1  =  PD-1PT   .  (4-44) 

The  equation  (4-44)  shows  incidentally  that  the  eigenvectors 
of  a  matrix  are  identical  to  those  of  its  inverse,  but  that 
the  eigenvalues  of  a  matrix  are  the  inverses  of  the  eigen- 
values of  the  inverse. 

Let  Q  be  the  covariance  matrix  of  a  set  of  statistical 
variates  X.   We  look  for  another  set  Y  as  function  of  X, 
such  that  the  covariance  matrix  of  Y  is  diagonal. 
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Putting  dX=a,  the  quadratic  form  which  is  minimized  is 
aTQ~^a,  where  Q  is  the  covariance  matrix  of  X.   If  we  define 

Y  =  PTX   ,  (4-45) 


then  we  have 


X  =  PY 


and 


dX  =  PdY   or   a  =  Pp  ,    aT  =  pTPT   . 

and  in  terms  of  P  =  dY,  the  quadratic  form  minimized  which 
led  to  the  values  X  becomes  PTPTQ~^-PP,  whence  the  covariance 
matrix  of  Y  is  seen  to  be  (PTQ_1P) _1=PTQP=D  by  Eq.  (4-44). 

It  is  then  shown  that  Y=PTX  is  the  vector  of  linear 
transforms  of  X  whose  components  are  uncorrelated.   A  vector 
of  such  components  is  called  a  vector  of  statistical 
variates  with  efficiency  1  (Eichhorn  and  Cole,  1985).   If 
any  of  its  components  are  changed  (e.g.  in  the  process  of 
finding  their  values  in  a  course  of  iteration) ,  none  of  the 
other  components  of  Y  would  thereby  be  changed  in  a  tradeoff 
between  changes  of  correlated  unknowns. 


Now,  back  to  the  normal  equations  A6=-B.   The  problem 
of  solving  normal  equations  above  can  therefore  be  attacked 
as  follows. 

1)  Find  the  eigenvalues  and  eigenvectors  of  A,  i.e.,  D-1  and 
P.   (A=Q-1,  according  to  the  notation  above.) 

2)  The  covariance  matrix  of  6,  Q,  can  then  be  calculated 
from  Eq.  (4-43),  Q=A_1=PDPT. 

3)  Let  S'=PTS.   This  gives  APS'=-B,  whence  S'=-PTQB,  or  from 
Eq.  (4-43), 

6'  =  -  DPTB   .  (4-46) 

Finding  D_1  from  D  is  very  easy  because  D  is  diagonal.   The 
vector  6  is  then  easily  calculated  directly  from  6=PS' . 

One  of  the  advantages  of  this  method  is  that  we  can 
calculate  both  5  and  6',  the  correlated  and  uncorrelated 
elements  of  the  corrections  to  the  adjustment  parameters,  at 
the  same  time.   The  vector  6'  is  the  set  of  the  corrections 
to  the  orthogonal  adjustment  parameters. 

Furthermore,  using  this  method,  a  measure  for  the 
"efficiency"  of  a  set  of  adjustment  parameters  can  be  easily 
calculated,  cf.  Eichhorn  and  Cole  (1985).   They  point  out 
that  the  information  carried  by  the  vector  6  of  the 
correlated  estimates  (whose  covariance  matrix  is  Q)  is 
partly  redundant  because  of  the  nonvanishing  correlations 
between  the  estimates.   What  is  the  information  content 
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carried  by  these  estimates?  We  see  that  the  matrix  PTQP  is 
diagonal  if  P  is  the  (orthogonal)  matrix  of  the  normalized 
eigenvectors  of  Q  and  that  it  is  also  the  (therefore 
uncorrelated)  linear  transforms  6'=PS  of  5.   We  might  regard 
the  number 


I  Q  I 

e  =  (4-47) 


^11   ^nn 


/ 


that  is,  the  product  of  the  variances  of  the  components  of 
vector  6'  (the  product  of  the  variances  of  the  uncorrelated 
parameters)  divided  by  the  product  of  the  variances  of  the 
components  of  vector  6  ( the  product  of  the  variances  of  the 
correlated  parameters)  as  a  measure  for  the  "efficiency"  of 
the  information  carried  by  the  estimates  6.   But  we  note 
that  according  to  the  definition  (4-47),  the  value  of  e  is 
severely  affected  by  the  number  of  variables.   In  order  to 
eliminate  the  effect  of  n,  we  redefine  e  as 


e  = 


"\ 


^11   ^nn  y 


1 
n 

(4-47') 


For  any  set  of  uncorrelated  estimates  we  would  have  e=l, 
which  is  evidently  the  largest  value  this  number  may  assume. 
In  our  work,  for  every  model  calculated,  the  efficiency 
of  the  set  of  adjustment  parameters  and  their  covariance 
matrix  are  calculated. 


70 

A  Practical  Example 

Table  4-3  lists  the  observation  data  for  51  Tau.   These 
data  are  provided  by  H.  A.  Macalister.   The  author  would 
like  to  thank  Macalister  and  Heintz  for  their  data  which  he 
has  used  in  this  dissertation. 

Theoretically,  the  first  step  in  our  computation  should 
be  the  reduction  of  the  measured  coordinates  to  a  common 
epoch  by  the  application  to  the  position  angles  of  correc- 
tions for  precession  and  for  the  proper  motion  of  the 
system.   The  distance  measures  need  no  corrections. 
Practically,  both  corrections  are  negligibly  small  unless 
the  star  is  near  the  Pole,  its  proper  motion  unusually 
large,  and  the  time  covered  by  the  observations  long.   The 
precession  correction,  when  required,  can  be  found  with 
sufficient  accuracy  from  the  approximate  formula 

A8  =  8  -  60  =  +0?00557sinasec6(t-t0)   ,      (4-48) 

which  is  derived  by  differentiating  Eq.  (3-1)  with  respect 
to  6,  and  introducing  the  precessional  change  A(ncosa)  for 
d6.   The  position  angles  are  thus  reduced  to  a  common 
equinox  t0  (for  which  2000.0  is  currently  used),  and  the 
resulting  node  Qq  also  refers  to  tg,  because 
A8=8-9o=Q-Qo=AQ.   Computing  ephemerids,  the  equinox  is 
reduced  back  from  tg  to  t. 

The  change  of  position  angle  by  proper  motion, 
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6  -  90  =  y.asina(t-t0)   ,  (4-49) 

where  the  proper  motion  component  in  right  ascension  u^  is 
in  degrees,  can  be  neglected  in  most  cases. 

The  two  formulae  above  can  be  found  either  in  Heintz ' 
book  "Double  Stars"  or  in  Aitken's  book  "The  Binary  Stars". 

In  table  4-4,  all  position  angles  have  been  reduced  to 
the  common  eguinox  2000.0  and  the  converted  rectangular 
coordinates  (xq,yq)    are  also  listed. 

In  figure  4-1,  all  pairs  of  rectangular  coordinates 
(xq,Yo)  are  plotted  in  the  xg-yo  plane.   We  can  see  at  a 
glance  that  all  observation  points  are  distributed  closely 
to  an  ellipse.   Furthermore,  in  Figures  2a  and  2b,  we  plot 
the  distance  p  in  seconds  of  arc  and  the  position  angles 
against  the  observing  epoch,  respectively.   From  Figure  2, 
we  see  that  the  observations  fall  upon  nearly  sine-like 
curves.   Thus,  we  get  the  impression  that  these  data  are 
rather  precise  and  therefore  give  all  observations  equal 
weights. 

For  these  data,  an  initial  approximate  solution  is 
easily  obtained  by  Kowalsky's  method.   The  initial 
approximation  ag  is  listed  in  Table  4-5. 

Starting  from  this,  the  final  solution  for  a  is 
obtained  after  only  three  iterations  and  shown  in  Table  4-6. 
The  calculation  required  only  47?62  CPU  time  on  a  VAX. 
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The  residuals  (observational  errors)  in  (p,8)  and  (x,y) 
are  shown  in  Table  4-7.   The  residuals  in  (p,8)  and  (x,y) 
are  plotted  against  the  observing  epoch  in  Figures  4-3  and 
4-4,  respectively.   They  show  a  random  distribution  as 
expected.   Also,  Figure  4-5  shows  the  comparison  of  the 
observation  points  with  the  corresponding  points  after 
correction  in  the  apparent  plane. 

The  "efficiency,"  the  covariance  matrix,  the  correla- 
tion matrix  and  transformation  matrix  (which  transforms  the 
correlated  parameters  to  the  uncorrelated  ones)  of  the 
adjusted  parameters  in  the  final  solution  are  calculated  and 
listed  in  Table  4-6.   In  addition,  Table  4-6  also  lists  the 
standard  deviations  of  a)  the  original  and  b)  the 
uncorrelated  parameters  in  the  final  solution. 
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Table  4-3. 
The  Observation  Data  for  51  Tau. 


HR1331 

51  Tau 

hd  ; 

11116 

SAO 

76541 

04185+2135  n 

1975.7160 

106.0° 

2, 

.0 

0.080 

0.003 

Al 

1975.9591 

91.9 

1, 

.7 

0.074 

0.003 

Al 

1976.8574 

34.9 

1. 

.5 

0.069 

0.005 

A2 

1976.8602 

33.5 

1, 

.5 

0.073 

0.009 

A2 

1976.9229 

22.9 

1, 

.0 

0.072 

0.008 

A3 

1977.0868 

26.7 

1, 

.0 

0.083 

0.008 

A3 

1977.6398 

8.8 

0.101 

A6 

1977.7420 

3.1 

0. 

.8 

0.110 

0.008 

A5 

1978.1490 

352.2 

0. 

.5 

0.113 

0.010 

A5  n 

1978.6183 

340.7 

0. 

.8 

0.108 

0.008 

A5 

1978.8756 

333.3 

2. 

.0 

0.086 

0.013 

B4 

1978.7735 

304.3 

0.090 

A7 

1980.1532 

285.9 

0.075 

A8 

1980.7182 

259.0 

0.079 

A8 

1980.7263 

255.8 

0.085 

A8 

1980.7291 

259.1 

0.087 

A8 

1982.7550 

191.80 

0.1343 

C2 

1982.7579 

192.65 

0.1362 

C2 

1982.7605 
1982.7633 

190.36 
192.90 

0.1315 
0.1381 

C2 
C2 

1982.7661 

193.39 

0.1308 

C2 

1983.0472 

186.18 

0.1333 

C2 

1983.0637 

187.21 

0.1499 

C2 

1983.7108 

182.05 

0.1456 

C2 

1983.7135 

179.56 

0.1480 

C2 

1983.9337 

181.0 

1. 

.9 

0.149 

0.010 

FA 

1983.9579 

176.7 

0.157 

RB 

1984.0522 

175.01 

0.1446 

C2 

1984.0576 

174.79 

0.1445 

C2 

1984.0603 

172.73 

0.1355 

C2 

1984.779 

157.3 

0.146 

RC 

1984.9308 

164.5 

2, 

.6 

0.141 

0.013 

FB 

1985.1063 

161.0 

2. 

.7 

0.137 

0.013 

FB 

1985.2048 

158.0 

3. 

.0 

0.125 

0.013 

FB 

1985.8378 

145.69 

0.1141 

## 

1985.8406 

144.53 

0.1202 

## 

1985.8541 

145.71 

0.1200 

## 
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Table  4-4. 
The  Reduced  Initial  Data  for  51  Tau 


t 

60 

P0 

x0 

Y0 

1 

1975.7160 

106.00 

0.0800 

-0.0222 

0.0769 

2 

1975.9591 

91.90 

0.0740 

-0.0026 

0.0740 

3 

1976.8574 

34.90 

0.0690 

0.0565 

0.0396 

4 

1976.8602 

33.50 

0.0730 

0.0608 

0.0404 

5 

1976.9229 

32.90 

0.0720 

0.0604 

0.0392 

6 

1977.0868 

26.70 

0.0830 

0.0741 

0.0374 

7 

1977.6398 

8.80 

0.1010 

0.0998 

0.0156 

8 

1977.7420 

3.10 

0.1100 

0.1098 

0.0061 

9 

1978.1490 

352.20 

0.1130 

0.1120 

-0.0151 

10 

1978.6183 

340.70 

0.1080 

0.1020 

-0.0355 

11 

1978.8756 

333.30 

0.0860 

0.0769 

-0.0385 

12 

1979.7735 

304.30 

0.0900 

0.0508 

-0.0743 

13 

1980.1532 

285.90 

0.0750 

0.0207 

-0.0721 

14 

1980.7182 

259.00 

0.0790 

-0.0150 

-0.0776 

15 

1980.7263 

255.80 

0.0850 

-0.0207 

-0.0824 

16 

1980.7291 

259.10 

0.0870 

-0.0163 

-0.0855 

17 

1982.7550 

191.80 

0.1343 

-0.1314 

-0.0176 

18 

1982.7579 

192.65 

0.1362 

-0.1329 

-0.0300 

19 

1982.7605 

190.36 

0.1315 

-0.1293 

-0.0238 

20 

1982.7633 

192.90 

0.1381 

-0.1346 

-0.0310 

21 

1982.7661 

193.39 

0.1308 

-0.1272 

-0.0305 

22 

1983.0472 

186.18 

0.1333 

-0.1325 

-0.0145 

23 

1983.0637 

187.21 

0.1499 

-0.1487 

-0.0190 

24 

1983.7108 

182.05 

0.1456 

-0.1455 

-0.0054 

25 

1983.7135 

179.56 

0.1480 

-0.1480 

0.0010 

26 

1983.9337 

181.00 

0.1490 

-0.1490 

-0.0028 

27 

1983.9579 

176.70 

0.1570 

-0.1568 

0.0088 

28 

1984.0522 

175.01 

0.1446 

-0.1441 

0.0124 

29 

1984.0576 

174.79 

0.1445 

-0.1439 

0.0129 

30 

1984.0603 

172.73 

0.1355 

-0.1344 

0.0170 

31 

1984.7790 

157.30 

0.1460 

-0.1348 

0.0562 

32 

1984.9308 

164.50 

0.1410 

-0.1359 

0.0375 

33 

1985.1063 

161.00 

0.1370 

-0.1296 

0.0445 

34 

1985.2048 

158.00 

0.1250 

-0.1160 

0.0467 

35 

1985.8378 

145.69 

0.1141 

-0.0943 

0.0642 

36 

1985.8406 

144.53 

0.1202 

-0.0980 

0.0696 

37 

1985.8541 

145.71 

0.1200 

-0.0992 

0.0675 

75 
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Table  4-5. 
Initial  Approximate  Solution  ag  for  51  Tau, 


p0(yr) 

TO 

ii 

a0 

e0 

i0° 

o>0° 

Q0° 

11.18 

1966.4 

0.128 

0.181 

127.3 

152.9 

170.2 
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Table  4-7. 
Residuals  of  the  Observations  for  51  Tau  in  (p,6)  and  (x,y) 


t 

ve 

VP 

vx 

Vy 

1 

1975.7160 

1.0090 

-0.0042 

0.0000 

-0.0044 

2 

1975.9591 

1.1702 

-0.0039 

-0.0012 

-0.0039 

3 

1976.8574 

2.5044 

0.0082 

0.0048 

0.0073 

4 

1976.8602 

3.7620 

0.0042 

0.0007 

0.0064 

5 

1976.9229 

1.2477 

0.0070 

0.0050 

0.0051 

6 

1977.0868 

-0.0465 

0.0009 

0.0009 

0.0002 

7 

1977.6398 

-2.0624 

-0.0015 

-0.0010 

-0.0039 

8 

1977.7420 

0.5795 

-0.0083 

-0.0083 

0.0004 

9 

1978.1290 

0.2656 

-0.0058 

-0.0057 

0.0011 

10 

1978.6183 

-0.5497 

-0.0015 

-0.0018 

-0.0006 

11 

1978.8756 

-0.2196 

0.0173 

0.0152 

-0.0082 

12 

1979.7735 

-2.1815 

-0.0054 

-0.0059 

0.0027 

13 

1980.1632 

-1.2302 

0.0036 

-0.0008 

-0.0039 

14 

1980.7182 

-2.8268 

-0.0005 

-0.0038 

0.0014 

15 

1980.7263 

-0.0249 

-0.0064 

0.0014 

0.0063 

16 

1980.7291 

-3.4623 

-0.0084 

-0.0032 

0.0093 

17 

1982.7550 

1.9542 

-0.0022 

0.0031 

-0.0038 

18 

1982.7579 

1.0538 

-0.0041 

0.0045 

-0.0013 

19 

1982.7605 

3.2987 

0.0007 

0.0009 

-0.0074 

20 

1982.7633 

0.7102 

-0.0058 

0.0060 

-0.0001 

21 

1982.7661 

0.1717 

0.0015 

-0.0014 

-0.0006 

22 

1983.0472 

2.7391 

0.0052 

-0.0043 

-0.0069 

23 

1983.0637 

1.4488 

-0.0111 

0.0115 

-0.0019 

24 

1983.7108 

-2.8848 

0.0019 

-0.0020 

0.0075 

25 

1983.7135 

-0.4324 

-0.0005 

0.0005 

0.0013 

26 

1983.9337 

-4.9126 

-0.0004 

0.0007 

0.0129 

27 

1983.9579 

-0.9444 

-0.0083 

0.0085 

0.0022 

28 

1984.0522 

-0.5459 

0.0042 

-0.0040 

0.0020 

29 

1984.0576 

-0.3999 

0.0043 

-0.0042 

0.0016 

30 

1984.0603 

1.6232 

0.0133 

-0.0136 

-0.0023 

31 

1984.7790 

6.9626 

-0.0028 

-0.0031 

-0.0173 

32 

1984.9308 

-2.5201 

-0.0004 

0.0022 

0.0060 

33 

1985.1063 

-1.7784 

0.0000 

0.0015 

0.0042 

34 

1985.2048 

-0.3956 

0.0097 

-0.0086 

0.0046 

35 

1985.8378 

-0.2949 

0.0015 

-0.0008 

0.0014 

36 

1985.8406 

0.8016 

-0.0047 

0.0037 

-0.0039 

37 

1985.8541 

-0.6865 

-0.0050 

0.0050 

-0.0016 
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Remarks 

Newton's  method  would  converge  well  if  the  observa- 
tional errors  are  small  enough  and  the  initial  approximate 
solution  is  sufficiently  accurate.   But,  when  the  errors  in 
the  observed  distances  and  position  angles  are  not  suffi- 
ciently  small,  or  the  initial  approximation  is  not  close 
enough  to  the  final  solution,  two  things  will  happen: 

1)  In  some  iterations,  the  corrections  to  the  adjust- 
ment parameters  are  too  big  so  that  some  parameters  go  to 
unreasonable  values;  e.g.,  the  semi-major  axis  a  becomes 
negative  or  the  mean  motion  n,  the  eccentricity  e  becomes 
negative,  which  are  unacceptable;  and  further  calculation 
becomes  pointless. 

2)  The  iterations  do  not  converge;  i.e.,  the  residual 
function  S  becomes  larger  in  the  next  iteration,  although 
all  parameters  remain  in  the  ranges  of  reasonable  values. 

In  these  cases,  Newton's  method  will  fail  to  yield  a 
solution.   In  the  next  chapter,  two  other  approaches  (the 
modified  Newton  methods)  are  proposed  to  deal  with  these 
cases. 


• 


CHAPTER  V 
THE  MODIFIED  NEWTON  SCHEME 


The  scheme  for  solving  the  nonlinear  condition  equa- 
tions by  Newton's  method  has  been  discussed  in  the  last 
chapter.   Although  rapid  convergence  can  be  expected  if  the 
initial  approximate  solution  is  sufficiently  accurate  and 
the  residuals  are  small  enough,  Newton's  method  often  fails 
to  yield  a  solution,  particularly  if  the  initial  approxima- 
tion ag  of  the  vector  a  is  greatly  in  error  or  the  residuals 
are  very  large.   As  mentioned  in  the  last  chapter,  two 
problems  arise  in  these  cases: 

1)  Some  of  the  corrections  6  to  the  adjustment  parameters 
are  too  big,  which  is  unacceptable.   For  example,  eQ=0.15, 
but  6e=-0.16,  so  that  the  new  value  e=-0.01;  in  this  case, 
the  residual  function  calculated  is  no  longer  meaningful  and 
any  further  calculation  becomes  pointless. 

2)  Divergence  occurs  directly;  i.e.,  the  value  of  residual 
function  Sq  is  larger  after  the  iteration  than  it  was 
before. 

For  both  of  these  cases,  the  major  problem  is  that  the 
step  size  could  be  too  large.   If  we  decrease  the  step  size, 
the  situation  might  be  improved  to  some  extent.   We  still 
rely  on  Newton's  method  indicating  the  right  direction,  but 
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we  no  longer  apply  the  full  corrections  which  would  follow 
from  the  original  formalism.   In  this  chapter,  we  will 
discuss  the  combination  of  Newton's  method  with  the  method 
of  steepest  descent  and  other  sophisticated  methods. 

The  Method  of  Steepest  Descent 
We  retain  A=f^Twfg  and  introduce  a  numerical  factor  1/f 
into  the  normal  equation  (4-8),  which  thus  becomes 

1 

-  AS  =  -  f-w$   .  (5-1) 

f  a 

By  choosing  an  appropriate  value  of  /,  we  would  reduce  the 
step  size  and  make  the  residual  function  Sq  gradually 
smaller  and  smaller.   This  is  analogous  to  the  basic  idea  of 
"the  method  of  steepest  descent"  which  is  to  step  to  the 
next  in  a  sequence  of  better  approximations  to  the  solution 
by  moving  in  the  direction  of  the  negative  of  the  gradient 
of  Sg.   If  the  step  is  not  too  large,  the  value  of  Sq  must 
necessarily  decrease  from  one  iteration  to  the  next.   Our 
goal  is  to  arrive  at  the  stable  absolute  minimum  value  of  Sq 
and  to  the  corresponding  set  of  parameters  which  is  the  best 
solution. 

At  this  point,  two  new  problems  arise: 

1)  How  does  one  calculate  the  residual  function  Sq  and  its 
gradient? 

2)  How  does  one  choose  the  best  value  of  fl 

■ 
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Suppose  we  were  to  pretend  that  the  current  values  of  a 
at  any  iteration  are  the  true  values  of  a.   This  assumption 
is  of  course  not  true.   But  it  makes  it  in  general  possible 
to  estimate  the  vector  v  as  a  function  of  a.   This  is  so 
because  if  the  present  value  of  a  were  the  true  value  of  a, 
there  would  be  a  definite  set  of  residuals  v  which  causes 
the  corrected  observations  to  satisfy  the  conditions 
equations  rigorously.   It  is  not  difficult  to  find  those 
components  of  v  which  correspond  to  certain  value  of  a. 
Suppose  we  have  an  approximation  a  and  a  and  vq  of  v,  we 
wish  to  obtain  the  "best"  approximation  of  v,  assuming  a  to 
be  correct.  We  then  need  to  minimize 

-T  -1-    »>T- 
S  =  -  v  a  v  +  f  y. 

2 
relative  to  the  remaining  variables  v  and  y.  which  yields 

f(x0  +  v,a)  =  0  (5-2a) 

o_1v  +  fxTu  =  0  (5-2b) 

Setting  v=Vq+b,    and  expanding  (5-2a)  into  powers  of  e,  we 
have 

f(x0,a)  +  f£0e  =  0  (5-3) 


88 
and  Eq.  (5-2)  becomes 

a_1(v0+e)  +  fx0TU  =  °  <5"4) 

where  x0  =  xq  +  vq.   Solving  Eq.  (5-4)  for  e  yields 

e  =  -Vq  -  afXQTii   .  (5-5) 

Substituting  now  (5-5)  into  Eq.  (5-3)  for  e,  we  get 

f(x0,a)  -  fx0v0  -  fx0afx0Tu  =0.  (5-6) 

From  Eq.  (5-6)  we  obtain 

U  =  w[f (x0,a)  -  fxo^ol   •  (5-7) 

From  Eqs.  (5-5)  and  (5-7)  we  arrive  therefore  at  the 
expression 

v  =  -afx0w[f (x0,a)  -  fxoVol   /  (5-8) 

where  w=fXQTafXQ  and  xq  =  xq  +  vq  still.   This  equation  may 
be  iterated,  if  needed,  to  arrive  at  a  definite  v;  i.e., 
until  the  value  of  v  on  the  left  hand  side  and  the  value  of 
vq  on  the  right  hand  side  are  the  same.   There  is  thus  a 
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corresponding  v  to  each  choice  of  parameters  a  and  we  can 
write  v=v(a).   Although  Eg.  (5-8)  formally  depends  on  v0  in 
the  first  order,  it  is  in  fact  rather  insensitive  to  the 
actual  value  of  v0  used,  since  it  is  really  guadratic  in  vq. 
To  appreciate  this,  note  that 


f(x0,a)  -  fxo(x0-x0)  =  f(x0,a)  +  O(v02)   .    (5-9) 

After  we  get  the  best  value  of  v  which  corresponds  to  a 
definite  value  of  a,  S  is  given  by 


T-1       T         T  — 1  - 

S  =  -  v  a     V  +  £  \l  =  -   v  a     v   ; 


i.e. ,  S  =  S0, 

because  in  this  case,  we  already  have  f=0  for  this  value  of 
v.   Fixing  the  value  of  a,  we  can  therefore  get  the  best 
corresponding  value  of  v,  then  calculate  the  value  of  the 
residual  function  Sg  which  corresponds  to  this  value  of  a. 
In  other  words,  we  may  also  write  SQ=SQ(a). 

Furthermore,  when  we  calculate  the  best  value  of  v  at  a 
from  Eg.  (5-8) , 

v  =  -  afx0w$   , 


where  again  $  =  f(x,a)-f£v 


Hence,  the  expression  for  Sq  will  be 
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_T  -1  _       -T       -1     T  -       -T  - 

Sn  =  -  v  a     v  =  -  <J>  wf  ~ao     af~   w<J>  =  -  fr  w|      .  (5-10) 

u        2  2  x  x  2 


This  expression  is  exact  if  it  is  evaluated  with  the  optimum 
value  of  v.   In  fact,  however, 

$  =  f(x,a)  -  f£v  ■  f(x0,a) 

is  not  sensitive  to  the  value  of  v  used. 

From  Eq.  (5-10),  we  can  estimate  the  negative  gradient 
6g  of  Sq  with  respect  to  a: 


8S0    8$T   _    1    3w  _ 
-  6  =  =  w$  +  — $  —  $ 

g   3a    8a       2    8a 


-T  -T          ^ 

3V1  Sv1 

f +  f i  -  f i o(v) 

x  3a  a    x  3a 


-2 
w<)>  +  0(v  ) 


T  - 

"  fa  w* 


(5-11) 


an  expression  which  also  depends  only  weakly  on  v.   In 
deriving  Eq.  (5-11),  the  symmetry  of  w  has  been  used  to 
combine  terms,  and  terms  O(v^)  have  been  neglected.   The 
vector  6g  points  along  the  negative  gradient  of  Sq,  i.e., 
the  direction  of  steepest  descent.   At  this  point,  let  us 


91 

delineate  the  method  of  steepest  descent  in  our  problem 
first.   It  can  be  modified  as  follows. 

At  each  step,  use  the  current  values  of  v  and  a  to 
compute 

$  =  f(ic,a)  -  f£V   ,  (5-12a) 

v  =  -  afxT$   ,  (5-12b) 

and  iterate  Eqs.  (5-12)  above  to  get  the  best  value  v  at  a. 
Since  $  is  quite  insensitive  to  the  values  of  v,  this  should 
converge  rapidly.   Then  calculate  the  residual  function  at 
the  best  value  of  v,  using 

1  1 

Sn  =  -  vTa~1v       or  -  $Tw$   .  (5-12c) 

0    2  2 

The  corrections  to  the  parameters  are  next  computed  from 


6g  =  -  /faTw$   ,  (5-12d) 

an  =  a  +  6g  ( 5-12e) 


Now  an  is  available.   Using  the  procedure  (5-12a)  to  (5-12c) 
again,  we  can  get  the  new  value  Sg.   In  Eg.  (5-12d),  the 
proportionality  constant  f   must  be  chosen  judiciously  so 
that  all  components  of  an  do  not  exceed  reasonable  ranges 
and  the  new  value  of  SQ  is  smaller  than  the  old  one. 
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This  procedure  can  always  be  made  to  converge.   But,  as 
we  know,  the  method  of  steepest  descent  too  has  its  draw- 
backs, principally  because  typically  the  convergence  may  be 
rather  slow. 


The  Combination  of  Newton's  Method  with  the  Method 
of  Steepest  Descent—The  Modified  Newton  Scheme 

Because  of  the  drawbacks  of  the  method  of  steepest 
descent  we  would  like  to  incorporate  the  results  above  into 
alternative  algorithms  which  combine  the  best  of  the 
Newton's  method  with  the  method  of  steepest  descent  by 
"interpolating"  between  them;  i.e,  combine  the  certain 
improvement  obtained  by  the  steepest  descent  method  when  the 
available  approximation  is  far  from  the  definitive  solution, 
with  the  rapid  convergence  of  Newton's  method  when  the 
current  approximation  is  already  close  to  it. 

In  order  to  implement  this  idea,  Eg.  (5-12d)  must  be 
slightly  modified. 

Consider  the  eguation 

1 

-  D6  =  -  f-Tw$  (5-13) 

i  a 

and  compare  it  with  Egs.  (5-12d)  and  (4-8).   In  Eg.  (5-12d), 
D=I,  a  unit  matrix,  and  in  Eg.  (4-8),  f=l   and  D=A=f^Tw$. 
The  eguation  (5-12d)  leads  to  the  pure  steepest  descent 
solution,  which  turns  out  to  be  undesirable  in  general. 
Actually,  we  can  retain  the  flexibility  to  choose  D 
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more  freely.   As  Jefferys  remarked,  this  will  allow  us  to 
improve  certain  characteristics  of  the  solution. 

a 


Set  D=A=faTw$  in  Eq.  (5-13)  to  obtain 


-  AS  =  -  f ~Tw* 

f  a 


which  is  the  same  as  Eq.  (5-1). 

If  Eq.  (5-1)  is  used  instead  of  (5-12d),  the  only 


difference  between  the  normal  equations  of  Newton's  method 
and  Eq.  (5-1)  is  the  numerical  factor  f. 

We  adopt  a  modification  of  Newton's  method  which 
introduces  only  the  numerical  factor  f   into  the  normal 
equation  (4-8)  as  in  Eq.  (5-1).   In  this  alqorithm,  we 
choose  the  value  of  f   such  that  Sq(=$tw$),  evaluated  as  a 
function  of  f  ,  is  a  minimum.   Once  6  has  been  chosen  in  a 
current  iteration,  that  value  of  f   is  determined  which 
minimizes—for  this  iteration—the  value  of  Sq.   It  is 
plausible  that  this  optimizes  the  gain  of  accuracy  during 
this  iteration,  and  the  next  iteration  proceeds  from  there. 
In  other  words,  we  use  the  usual  normal  equations  (f=l)    to 
estimate  the  direction  of  the  vector  at  each  step,  and  then 
choose  the  value  of  f   to  get  the  optimum  length. 

For  arriving  at  the  optimum  value  of  f ,    the  so-called 
"Golden  section  method"  or  "0.618  method"  is  used.   This 
method  was  originally  conceived  for  searching  for  the 
minimum  of  a  function  of  one  variable.   The  absolute  minimum 
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may  be  found  by  this  method  if  the  function  has  only  a 
single  peak  or  a  strict  vertex.   Otherwise,  for  a  function 
with  more  than  one  peaks,  the  method  will  find  a  local 
minimum.   In  any  case,  a  point  at  which  the  function  is 
smaller  than  at  the  initial  point  can  be  always  reached. 

For  every  set  of  parameters  a,  there  is  a  corresponding 
best  set  of  corrections  v,  and  for  every  value  of  f ,  there 
is  a  corresponding  set  of  corrections  6.   Therefore, 
actually  we  can  write  So=Sq(/);  the  scalar  residual  function 
Sq  is  a  function  of  variable  f.      Anyway,  we  can  thus 
obviously  search  at  least  for  that  /  which  will  produce  a 
local  minimum  of  Sq . 

The  scheme  for  finding  best  value  of  i    (i.e.  the 
minimum  Sg)  at  each  iteration  consists  of  two  parts.   The 
first  of  these  is  to  delineate  an  interval  in  which  the 
minimum  is  located;  then  look  for  the  minimum  by  the  "0.618 
method"  within  this  interval.   The  searching  procedure  is  as 
follows. 

1)  Choose  u  as  the  length  for  the  initial  step  as  well  as  a 
positive  number  F=(./5-l)/2  (F=0.618)  and  a  small  quantity  e. 

2)  Define  /=10u;  Sq=Sq(/)  now  is  equivalent  to  S0=S0(u);  put 
u(0)=q.   The  case  u=o,  i.e.,  f=l,    corresponds  to  Newton's 
method.   The  superscript  indicates  the  sequence  number  of 
the  current  iteration. 

If  S0(u(0)  +  u)<S0(u(°)),  go  to  step  3. 
If  S0(u<0)  +  u)>S0(u<°>),  put 
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u  =  -  u   , 

u(-l)  =  u(0)  +  u   f 

and  also  go  to  step  3 . 

3)  Calculate  u(k+1)=u(k)  +  u  and  S0(u(k+1h. 

4)  If  S0(u(k+1h<S(u(kh,  then 

u=2u;    K=K+1;    go   to    step    3. 
If   S0(u(k+1h    >   S(u<k)),    then 

u3(0)    =  u<k+1),    Ul(°)    =  u(k~D       ; 

d(0)    =  u3(0)-Ul(0)       ; 

i   =    0       ; 

go  to  step  5 . 

5)  Let  y^1)    =  u^iJ-Fd*1),    y2^)    =  u^iJ+Fd*1). 

6)  If   S0(Yi(1h    <   S0(y2(i)),    then 

d(i+l)    =  y2(i)    _  Ul(i)    f   Ul(i+D    =  ux(i)      , 

u3(i+l)    =  y2(i)    and 

go  to   step  7 . 

If   S0(y1(1))    >   S0(y2(1)),    then 

d(i+l)    =  U3(i)    _  yi(i)      f        Ul(i+D    =  Yl(i) 

u3(i+D   =  u3(i)    and 

also  go  to  step  7. 

If   Soty!^1)    =  S0(y2(i)),    then 

d(i+l)   =  y2(i)_Ul(i)   =  U3(i)    _  yi(i)    and 

Ul(i+D    =  u^1)    ,        u3(i+1)    =  y^1)        or 
Ul(i+D    =  Yl(i)    ,        u3(i+D    =  u3(i) 
and  also  go  to  step  7 . 

7)  If    | cat  i+1 )  j  <e  ,    then  stop  and 
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%in  -  ui«i+1)  ♦  d(i+1'/2   , 

Otherwise  i=i+l  and  return  to  step  5. 
This  process  searches  for  the  minimum  of  Sg  at  each  itera- 
tion and  accelerates  the  convergence. 

In  summation,  Newton's  modified  method  consists  of  the 
following  steps. 
Step  1. 

Using  Egs.  (5-12a)  and  (5-12b)  iteratively,  get  the 
best  value  of  v  at  a  (initially,  set  a=ag  and  since  v  is 
unknown,  set  vq=0) .   This  converges  rather  rapidly. 
Step  2. 

Compute  S0=So(a)=$Tw$  at  a 
Step  3. 

Set  the  starting  value  for  f   egual  to  1,  i.e.,  u=0  and 
use  Eg.  (5-1)  (in  this  case,  eguivalent  to  Eg.  (4-8)),  to 
compute  S. 
Step  4. 

Calculate  an  from  an  =  a  +  6 
Step  5. 

Check  every  element  in  an  to  see  if  they  are  all 
located  in  the  ranges  of  the  allowed  values  (e.g.,  a>0, 
0<e<l,  n>0,...).   If  not,  reduce  u=u-0.5  (i.e.,  f=10u_0-5) 
and  recompute  the  correction  vector  6  using  the  eguation 

6'  =  /S   ,  (5-14) 
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and  set  6=6',  return  to  step  4. 

This  step  may  be  iterated  until  every  element  in  a  has 
converged  to  a  reasonable  value. 
Step  6. 

Repeat  the  process  of  step  1  to  get  the  best  v  at  an 
and  compute 

S0'  =  S0(an)  =  s0(a+6). 

If  Sq'^Sq/  set  u=u-0.5  and  return  to  step  5. 
Step  7. 

If  S0'<So  and  u=0  (/=1),  then  go  to  step  9. 

If  Sq'<So  and  u*0  (f*l),    then  proceed  to  next  step. 
Step  8. 

Use  the  "0.618  method"  to  find  the  optimum  value  of  / 
to  get  the  minimum  of  the  residual  function,  Smj_n,  at  this 
iteration,  and  the  corresponding  optimum  improvement  of  an. 
Step  9. 

Test  for  convergence.   That  is,  test  again  the  size  of 
each  component  of  6  and  v  against  the  corresponding  com- 
ponent in  a  and  x.  When  the  change  is  sufficiently  small 
for  all  components,  the  process  may  be  said  to  have 
converged.   If  the  convergence  has  not  yet  been  achieved, 
return  to  step  3  for  next  iteration. 

The  scheme  described  above  is  efficient.   In  comparison 
with  Newton's  method  which  was  described  in  the  last 
chapter,  we  find  two  essential  differences: 
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1)  In  this  method,  a  corresponding  best  estimate  v  has  to  be 
found  for  every  estimate  a  by  using  Eg.  (5-12a)  and  (5-12b) 
iteratively.   In  Newton's  method,  this  is  not  taken  into 
account. 

2)  In  the  method  proposed  here,  the  numerical  factor  f   is 
applied  to  the  normal  eguation  for  5  and  the  best  value  of 
f,   which  leads  to  a  minimum  Smj_n,  must  be  found  by  the 
"0.618  method". 

We  see,  however,  that  if  the  an,  which  is  computed  from 
that  value  of  5  which  is  the  solution  of  the  same  normal 
eguations  for  6  as  in  Newton's  method  (here,  f=l) ,    already 
produces  a  smaller  value  of  Sg,  the  scaler  residual  func- 
tion,  one  does  not  need  to  bother  with  searching  for  the 
best  value  of  f   for  the  iteration.   This  iteration  is  then 
similar  to  what  one  would  do  in  Newton's  method. 

Jefferys  (1981)  points  out  that  this  method  is  somewhat 
like  a  modified  Fletcher-Powell  algorithm  (1963).   We 
therefore  call  this  as  "FP  method".   In  addition,  as  a 
slight  further  modification  to  scheme  above,  Marguardt's 
algorithm  (D.  W.  Marguardt,  1963)  can  also  be  incorporated 
to  suggest  another  approach  for  solving  our  orbit  problem. 

The  Application  of  Marguardt's  Algorithm 
In  a  sense,  the  basic  idea  behind  Marguardt's  algorithm 
is  to  interpolate  between  Newton's  method  and  the  method  of 
steepest  descent  such  that  initially,  far  from  the  solution, 
steepest  descent  dominates,  and  that  the  more  efficient 
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Newton  method  dominates  the  calculations  as  the  solution  is 
approached.   This  is  similar  to  what  we  have  done  above  in 
our  improved  Newton's  method. 

Comparison  of  Marguardt's  paper  (1963)  with  Eq.  (5-1) 
suggests  that  the  natural  generalization  of  this  fundamental 
eguation  is 


1  T 

(A  +  -  D)  S  =  -  f-  w<|)   ,  (5-15) 

f  a 


where  D  is  to  be  chosen  appropriately.  Because  the  gradient 
methods  are  not  scale  invariant,  it  is  desirable  to  choose  D 
so  that  it  has  the  form 

D  =  diag  (A]^,  A22/  •••/  A77)/  (5-16) 


where  Ajj  are  the  elements  of  the  matrix  A.   The  effect  of 
this  is  to  scale  the  gradient  along  each  coordinate  axis  by 
the  factor  Ajj-"'"   • 

This  is  also  exactly  Jefferys'  (1981)  suggestion.   In 
our  case,  Marguardt's  algorithm  would  take  the  same  form  as 
described  above  for  the  improved  Newton  scheme,  namely  the 
"FP  method",  except  for  step  3,  where  Eq.  (5-1)  would  have 
to  be  replaced  by 


1 
(A  +  -  D)6  =  -f-Tw$   . 
i 
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We  call  this  the  "MQ  method". 

Using  the  two  modifications  of  Newton's  method  just 
described  (FP  and  MQ  method,  respectively)  for  all  observa- 
tion data,  the  same  result  should  be  obtained.   Our  calcula- 
tion confirmed  this  for  all  the  practical  examples  in  our 
paper.   We  noticed,  however,  that  Marguardt's  method 
converges  much  faster  than  the  other  one.   The  reason  for 
this  might  be  as  follows. 

Comparing  the  two  eguations  (5-1)  and  (5-15): 


1         T 

-  A6  =  -f-  w*  and 

i  a 


1  T 

(A  +  -  D)S  =  -f-  w<|>   . 

f  a 

where  A=f£Twfg  and  D=diag( &\\,    A22/  •••»  A77)/  we  see  tftat 
in  eguation  (5-1),  which  is  used  in  "FP  Method",  the  only 
difference  from  A5=-f^Tw$,  the  normal  eguation  in  the  Newton 
scheme,  is  the  numerical  factor  f ;  all  elements  in  A  are 
multiplied  by  the  same  factor  1/f.      The  solution  S  is 
therefore  only  changed  by  the  same  numerical  factor,  i.e., 
the  ratios  between  all  elements  in  6  remain  unchanged. 
However,  in  the  "MQ  method",  as  seen  in  Eg.  (5-15),  the 
coefficient  matrix  of  6  is  modified  as 


1 
A  +  -  D 
f 
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in  which  only  the  diagonal  elements  are  changed.   The 
coefficient  matrix  of  S  is  gradually  becoming  diagonal  in 
the  course  of  the  "MQ"  iterations.   This  means  that  the 
corrections  to  the  parameters  (but  not  the  parameters 
themselves)  become  uncorrelated.   However,  we  have  to  notice 
that  the  inverse  of  the  coefficient  matrix  of  6  and  the 
covariance  matrix  of  the  parameters  are  no  longer  the  same 
in  this  method,  since  the  covariance  matrix  of  the  para- 
meters is  intrinsically  the  inverse  of  A.   Thus  we  calculate 
only  the  "efficiency"  of  the  adjusted  parameters  in  final 
solution. 

Two  Practical  Examples 
Table  5-1  lists  the  observations  for  [3738.   These 
observations  were  collected  by  Heintz.   Table  5-2  contains 
the  initial  data.   All  the  position  angles  in  the  observa- 
tions have  been  reduced  to  equator  2000.0.   Figure  5-1  shows 
all  the  observation  points  plotted  in  the  Xg-yg  plane.  In 
Figure  5-2,  the  Xg,  yg  coordinates  are  respectively  plotted 
against  the  observing  epoch.   From  Figure  5-2,  we  see  that 
the  four  earliest  observations  are  scattered  widely.   The 
initial  approximate  solution  is  listed  in  Table  5-3. 

Initially,  the  same  weights  were  assigned  to  all  26 
observations.  The  same  final  solution  is  obtained  using 
both  the  "FP"  and  the  "MQ"  scheme  as  shown  in  Table  5-4; 
this  is  called  "solution  #1"  for  (3738.  "FP"  requires  13 
iterations  and  spent  12m06?18  CPU  time  on  the  VAX  while  "MQ" 
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requires  only  10  iterations  and  9mlls01  CPU  time  for 
arriving  at  the  same  final  result.   The  covariance,  the 
correlation  and  the  transformation  matrices  as  well  as  the 
standard  deviations  of  the  parameters  in  this  solution  are 
also  listed  in  Table  5-4. 

For  evaluating  a  final  result,  the  residuals  are 
important.   They  must  be  random  and  reasonably  small.   The 
residuals  in  p,  9,  x  and  y  for  this  solution  are  listed  in 
Table  5-5,  which  shows  that  some  are  very  big,  especially 
those  for  the  four  earliest  points  which  were  observed 
before  1921.   In  Figures  5-3  and  5-4,  the  residuals  in  p,  8, 
x  and  y  are  respectively  plotted  against  the  observing 
epoch.   Also,  Figure  5-5  displays  all  the  initial  observa- 
tion points  compared  with  those  after  correction;  short  bars 
connect  every  two  corresponding  points.   From  the  relevant 
tables  and  plots  above,  we  see  that  the  residuals  seem  to  be 
not  very  reasonable. 

Heintz  had  already  computed  an  orbit  for  3738  from  the 
same  observation  data.   His  result  is  shown  in  Table  5-6 
which  differs  greatly  from  the  solution  #1.   This 
discrepancy  originates  in  earliest  four  uncertain  observa- 
tions.  These  weak  observations  affect  the  overall  result. 
Giving  all  the  observations  the  same  weights  is  obviously 
inappropriate.   We  therefore  assigned  a  much  lower  weight 
(0.03)  to  the  earliest  four  observations.   Using  these 
weighted  observation  data,  both  the  "FP"  and  "MQ"  scheme 
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still  yield  exactly  the  same  solution  which  agrees  fairly 
well  with  Heintz'  result.   This  solution  is  listed  in  Table 
5-7  and  called  "solution  #2"  for  (3738.   This  time,  both 
methods  run  through  only  6  iterations  and  used  the  same  CPU 
time,  around  2m  on  the  VAX,  because  in  all  iterations  by 
both  methods  Newton's  method  dominates,  that  is,  neither 
"FP"  nor  "MQ"  has  been  really  called  into  action.   The 
residuals  in  p,  6,  x  and  y  in  this  solution  are  generally 
much  smaller  than  in  solution  #1,  which  are  listed  in  Table 
5-8,  and  plotted  in  Figures  5-6  and  5-7.   Figure  5-8  shows 
again  the  comparison  of  initial  observations  with  those 
after  correction.   Table  5-7  shows  also  the  final 
covariance,  correlation  and  transformation  matrices  as  well 
as  the  standard  deviations. 

In  addition,  we  also  removed  the  four  observations 
before  1921  from  the  input  data.   Using  only  the  22  observa- 
tions that  remained,  the  solution  is  essentially  the  same  as 
the  weighted  26  observations  by  both  the  methods  as  we  would 
expect. 

Another  example  is  BD+19°5116.   The  observations  (35 
points)  and  the  reduced  initial  data  are  shown  in  Tables  5-9 
and  5-10.   The  observations  are  again  plotted  in  the  Xg-yo 
plane  in  Figure  5-9,  and  x0,  yo  coordinates  are  also  plotted 
against  the  time  of  observation  in  Figure  5-10.   From  Figure 
5-7,  we  see  that  the  observations  cover  only  a  very  short 
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arc  on  the  ellipse.   The  computation  from  these  data  will  be 
more  difficult. 

Using  the  "MQ"  method,  we  arrived  at  a  final  solution 
which  is  shown  in  Table  5-12.   Only  5  iterations  are  needed 
and  around  9m  CPU  time  is  used.   The  residuals  in  p,  6,  x 
and  y  are  listed  in  Table  5-13,  and  plotted  in  Figures  5-11 
and  5-12.   The  initial  observations  are  again  compared  with 
those  after  correction  in  Figure  5-13.   Also,  the 
covariance,  the  correlation  and  the  transformation  matrices 
and  the  standard  deviations  are  given  in  Table  5-12. 

For  these  data,  the  "FP"  scheme  was  also  used. 
Although  it  gave  the  same  result,  it  ran  through  around  900 
iterations  and  reguired  4  hours  CPU  time! 
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Table  5-1. 


The  Observation  Data  for  3738 
(courtesy  W.  D.  Heintz) 


(3  738      02232     s2952    (2000)      7.5-7.8     F8 


1879.70 

183.1 

0.64 

2 

1891.80 

174.7 

0.55 

3 

1900.21 

177.5 

.75 

3 

05.98 

193.8 

.66 

2 

21.15 

50.5 

.44 

10 

26.67 

43.5 

.46 

15 

29.77 

40.0 

.50 

8 

32.45 

38.5 

.55 

15 

34.76 

37.3 

.57 

11 

38.16 

35.0 

.59 

14 

42.52 

32.3 

.57 

7 

45.80 

30.1 

.62 

12 

47.98 

30.4 

.44 

8 

55.00 

25.6 

.22 

4 

56.45 

13.2 

.19 

4 

59.03 

352.1 

.13 

10 

59.97 

307.1 

.10 

6 

61.03 

249.2 

.11 

3 

62.53 

222.7 

.14 

2 

64.58 

223.5 

.19 

14 

66.02 

217.5 

.26 

4 

66.94 

220.9 

.32 

5 

69.04 

218.8 

.56 

3 

75.55 

217.4 

.62 

9 

77.28 

217.1 

0.75 

7 

1983.71 

215.4 

1.03 

2 

p 

3 

Doo  2  Booth  1 

01 

6 

V  8  B  7 

B  V  4 

<p  8  B  4  6  3 

B  V  4  6  3 

Sim  10  B  4 

B  4  V  3 

Strom  8  V  4 

B 

B  2  cp  int  2 

B  2  <p  int  2 

cp  int  7  B  3 

cp  int  3  B  3 

cp  int 

cp  int 

cp  int  10  B  4 

cp  int  2  B  2 

Kni 

cp  int 

Wor  7  Hln  2 

Hln  4  hz  3 

hz 


Finsen  published  an  elliptical  (P=110yr)  and  a  parabolic 
solution  (cf  the  1969  orbit  catalogue).   Heintz'  result  is: 
P  290  yr,  T  1952.0,  a  1.460,  e  0.63,  i  96.7,  to  38.0,  node 
29.4  (2000). 

The  position  angles  are  oriented  to  equator  2000.   The 
observations  before  1920  are  from  Northern  instruments,  and 
are  correspondingly  uncertain. 
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Table   5-2. 
The  Reduced  Initial  Data   for   (3738 


t 

e0 

P0 

x0 

Y0 

1 

1879.70 

183.1 

0.64 

-0.639 

-0.035 

2 

1891.80 

174.7 

0.55 

-0.548 

0.051 

3 

1900.21 

177.5 

0.75 

-0.749 

0.033 

4 

1905.98 

193.8 

0.66 

-0.641 

-0.157 

5 

1921.25 

50.5 

0.44 

0.280 

0.340 

6 

1926.67 

43.5 

0.46 

0.334 

0.317 

7 

1929.77 

40.0 

0.50 

0.383 

0.321 

8 

1932.45 

38.5 

0.55 

0.430 

0.342 

9 

1934.76 

37.3 

0.57 

0.453 

0.345 

10 

1938.16 

35.0 

0.59 

0.483 

0.338 

11 

1942.52 

32.3 

0.57 

0.482 

0.305 

12 

1945.80 

30.1 

0.62 

0.536 

0.311 

13 

1947.98 

30.4 

0.44 

0.380 

0.223 

14 

1955.00 

25.6 

0.22 

0.198 

0.095 

15 

1956.45 

13.2 

0.19 

0.185 

0.043 

16 

1959.03 

352.1 

0.13 

0.129 

-0.018 

17 

1959.97 

307.1 

0.10 

0.060 

-0.080 

18 

1961.03 

249.2 

0.11 

-0.039 

-0.103 

19 

1962.53 

222.7 

0.14 

-0.103 

-0.095 

20 

1964.58 

223.5 

0.19 

-0.138 

-0.131 

21 

1966.02 

217.5 

0.26 

-0.206 

-0.158 

22 

1966.94 

220.9 

0.32 

-0.242 

-0.210 

23 

1969.04 

218.8 

0.56 

-0.436 

-0.351 

24 

1975.55 

217.4 

0.62 

-0.493 

-0.377 

25 

1977.28 

217.1 

0.75 

-0.598 

-0.452 

26 

1983.71 

215.4 

1.03 

-0.840 

-0.597 
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Table  5-3. 


P0        T0        ajj      eQ        ij      <*q  Bo 

156.65     1955.2     0.958     0.523     105.2     57.4     31.2 
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Table  5-5. 

Residuals  of  the  Observations  in  9,p 
and  x,y  for  Solution  #1  of  3738 


t 

ve 

VP 

vx 

VY 

1 

1879.70 

22.281 

0.201 

-0.121 

-0.326 

2 

1891.80 

24.309 

0.229 

-0.189 

-0.305 

3 

1900.21 

15.235 

-0.138 

0.153 

-0.168 

4 

1905.98 

-8.507 

-0.203 

0.186 

0.115 

5 

1921.25 

30.522 

-0.239 

-0.248 

-0.141 

6 

1926.67 

7.713 

-0.121 

-0.122 

-0.053 

7 

1929.77 

3.396 

-0.076 

-0.075 

-0.030 

8 

1932.45 

0.244 

-0.059 

-0.047 

-0.035 

9 

1934.76 

-1.703 

-0.030 

-0.014 

-0.031 

10 

1938.16 

-3.224 

0.002 

0.020 

-0.026 

11 

1942.52 

-4.821 

0.041 

0.060 

-0.023 

12 

1845.80 

-5.931 

-0.038 

-0.006 

-0.073 

13 

1947.98 

-8.737 

0.100 

0.122 

-0.023 

14 

1955.00 

-19.091 

0.075 

0.095 

-0.062 

15 

1956.45 

345.659 

0.043 

0.048 

-0.048 

16 

1959.03 

-23.659 

0.007 

-0.012 

-0.054 

17 

1959.97 

0.265 

0.020 

0.013 

-0.016 

18 

1961.03 

31.476 

0.014 

0.062 

-0.019 

19 

1962.53 

30.273 

0.024 

0.055 

-0.062 

20 

1964.58 

10.992 

0.058 

-0.006 

-0.071 

21 

1966.02 

10.225 

0.051 

-0.003 

-0.072 

22 

1966.94 

3.779 

0.032 

-0.008 

-0.038 

23 

1969.04 

0.972 

-0.120 

0.098 

0.069 

24 

1975.55 

-5.478 

0.046 

-0.073 

0.024 

25 

1977.28 

-6.502 

-0.039 

-0.014 

0.091 

26 

1983.71 

-8.721 

-0.208 

0.105 

0.228 
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Heintz'    Result   for   (3738 
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Table  5-8. 

Residuals  in  p,0,x  and  y  of  the  Observations 
for  3738  in  Solution  #2 


t 

ve 

VP 

vx 

VY 

1 

1879.70 

2.160 

-0.207 

0.208 

-0.005 

2 

1891.80 

-10.692 

-0.309 

0.316 

0.016 

3 

1900.21 

-54.998 

-0.585 

0.661 

0.106 

4 

1905.98 

-107.227 

-0.473 

0.652 

0.344 

5 

1921.25 

-2.623 

-0.039 

-0.011 

-0.042 

6 

1926.67 

-0.680 

0.020 

0.018 

0.009 

7 

1929.77 

0.579 

0.019 

0.011 

0.016 

8 

1932.45 

0.384 

-0.002 

-0.004 

0.002 

9 

1934.76 

0.248 

-0.002 

-0.003 

0.001 

10 

1938.16 

0.715 

-0.004 

-0.007 

0.004 

11 

1942.52 

1.131 

0.011 

0.003 

0.015 

12 

1845.80 

1.485 

-0.073 

-0.070 

-0.024 

13 

1947.98 

-0.228 

0.066 

0.058 

0.032 

14 

1955.00 

-4.324 

0.043 

0.046 

0.000 

15 

1956.45 

3.020 

0.006 

0.004 

0.011 

16 

1959.03 

-5.384 

-0.047 

-0.048 

-0.001 

17 

1959.97 

4.197 

-0.039 

-0.020 

0.034 

18 

1961.03 

16.641 

-0.034 

0.034 

0.027 

19 

1962.53 

16.383 

-0.003 

0.032 

-0.023 

20 

1964.58 

4.004 

0.044 

-0.020 

-0.041 

21 

1966.02 

6.319 

0.041 

-0.011 

-0.050 

22 

1966.94 

1.320 

0.024 

-0.013 

-0.022 

23 

1969.04 

0.912 

-0.122 

0.099 

0.071 

24 

1975.55 

-1.468 

0.079 

-0.073 

-0.033 

25 

1977.28 

-1.755 

0.010 

-0.022 

0.013 

26 

1983.71 

-1.629 

-0.068 

0.040 

0.062 
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Table  5-9. 

The  Observation  Data  for  BD+19°5116 
(courtesy  W.  D.  Heintz) 


+19°5116 

23319 

N1956 

(2000) 

10.4-12.1  M4Ve 

1945.473 

174.47 

3.563 

2n 

Ilex   S 

49.809 

165.72 

3.430 

3 

21 

51.964 

161.08 

3.435 

3 

14 

54.804 

154.45 

3.583 

2 

10 

55.971 

152.54 

3.565 

2 

10 

58.482 

147.18 

3.616 

3 

14 

60.152 

143.91 

3.648 

6 

50 

61.682 

141.51 

3.708 

4 

46 

62.835 

139.46 

3.759 

8 

70 

63.757 

138.49 

3.770 

4 

31 

64.860 

136.50 

3.810 

6 

56 

65.766 

134.67 

3.840 

3 

23 

68.289 

129.65 

3.967 

2 

21 

69.838 

127.34 

4.101 

6 

76 

70.792 

126.12 

4.098 

2 

28 
36 

71.939 

124.54 

4.134 

5 

73.903 

121.48 

4.236 

7 

32 

75.872 

118.90 

4.297 

2 

16 

77.049 

117.65 

4.361 

4 

25 

78.904 

115.01 

4.431 

2 

12 

80.529 

112.47 

4.621 

3 

12 

81.856 

111.72 

4.602 

4 

18 

1983.581 

108.33 

4.676 

3 

19 

1941.75 

182.17 

3.678 

9 

M 

42.65 

179.70 

3.643 

6 

43.79 

177.20 

3.750 

5 

44.72 

174.82 

3.679 

4 

46.03 

172.59 

3.605 

5 

49.75 

165.33 

3.644 

4 

1955.78 

153.38 

3.605 

5 

1952.19 

161.8 

3.52 

7 

VB 

59.39 

145.2 

3.82 

6 

C  Wor  3 

62.07 

141.1 

3.76 

11 

B  VB  4  C  3 

65.53 

137.0 

3.97 

7 

VB 

1981.64 

112.3 

4.24 

2 

hz 

Pg.  positions  from  parallax  plates  at  Swarthmore  (S) 
refractor  (meas.  Heintz),  from  McCormick  (M)  refractor 
(meas.  Eichhorn);  few  micrometer  observations.   Position 
angles  eguator  2000. 
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Table  5-10. 
The  Reduced  Initial  Data  for  BD+19°5116 


6q        P0  x0  Y0 


1 

1941.750 

182.17 

3.678 

2 

1942.650 

179.70 

3.643 

3 

1943.790 

177.20 

3.750 

4 

1944.720 

174.82 

3.679 

5 

1945.473 

174.47 

3.563 

6 

1946.030 

172.59 

3.605 

7 

1949.750 

165.33 

3.644 

8 

1949.809 

165.72 

3.430 

9 

1951.964 

161.08 

3.435 

10 

1952.190 

161.80 

3.520 

11 

1954.804 

154.45 

3.583 

12 

1955.780 

153.38 

3.605 

13 

1955.971 

152.54 

3.565 

14 

1958.482 

147.18 

3.616 

15 

1959.390 

145.20 

3.820 

16 

1960.152 

143.91 

3.648 

17 

1961.682 

141.51 

3.708 

18 

1962.070 

141.10 

3.760 

19 

1962.838 

139.46 

3.759 

20 

1963.757 

138.49 

3.770 

21 
22 

1964.860 

136.50 

3.810 

1965.530 

137.00 

3.970 

23 

1965.766 

134.67 

3.840 

24 

1968.289 

129.65 

3.967 

25 

1969.838 

127.34 

4.101 

26 

1970.792 

126.12 

4.098 

27 

1971.939 

124.54 

4.134 

28 

1973.903 

121.48 

4.236 

29 

1975.872 

118.90 

4.297 

30 

1977.049 

117.65 

4.361 

31 

1978.904 

115.01 

4.431 

32 

1980.529 

112.47 

4.621 

33 

1981.640 

112.30 

4.240 

34 

1981.856 

111.72 

4.602 

35 

1983.581 

108.33 

4.676 

3.6754 

-0.1393 

3.6430 

0.0191 

3.7455 

0.1832 

3.6640 

0.3322 

3.6454 

0.3434 

3.5749 

0.4649 

3.5252 

0.9228 

3.3240 

0.8460 

3.2494 

1.1138 

3.3439 

1.0994 

3.2326 

1.5453 

3.2229 

1.6153 

3.1633 

1.6439 

3.0388 

1.9599 

3.1368 

2.1801 

2.9479 

2.1489 

2.9023 

2.3078 

2.9262 

2.3611 

2.8567 

2.4433 

2.8231 

2.4986 

2.7637 

2.6226 

2.9035 

2.7075 

2.6996 

2.7309 

2.5313 

3.0544 

2.4874 

3.2605 

2.4157 

3.3103 

2.3439 

3.4053 

2.2120 

3.6126 

2.0767 

3.7619 

2.0238 

3.8630 

1.8733 

4.0155 

1.7661 

4.2707 

1.6089 

3.9229 

1.7031 

4.2753 

1.4706 

4.4387 

125 


CM 


CD 


CO 


CD 
CM 


CD 


CD 

d 


CD 


CD 


O 

d 

I 


o 

>1 

o 

B 

p 

■•-{ 


o 

in 

CM 



0 

1 

u 

iH 

i_ 

ft 

D 

PQ 

-* 

M— 

M 

CM 

O 

0 

1 

OT 

m 

XJ 

-p 

CO 

c 
o 

CM 

o 

o 

1 

(D 

•H 

w 

-P 
> 

(N 

o 

H 

0) 

X 

CO 

rO 

-Q 

(OJD  p   SpU009S)   °X 


a) 

X! 
-P 

in 

O    • 
0 

p  e 
o  ns 


I 

m 

0) 

M 

& 

•H 

64 


(OJD  JO  SpUOOGS)   °/\ 


126 


CD 


CD 


CD 

CN 


CD 


CD 
O 


o 


♦ 

I 

o 

1 

1 

1 

♦ 

♦  oo 
o 

♦ 

o 

♦ 

o 
o 

- 

♦ 

o 

♦             o 

♦  o 

♦  o 

♦            o 

♦ 

o 

t 

♦ 

0    o 
o 

♦        CO 

♦       o 

*>o 

* 

o 

o 

X 

1 

o 

o 

1 

1 

« 

♦ 
o     ♦ 
o       ♦ 

1 

LO 
00 
CD 


1^ 

I 


ro 

CD 

• 

• 

CM 

CN 

1 

1 

LO 


CD 

r^ 

en 

*"* 

c 

o 

r^ 

■+-> 

CD 

u 

CD 

> 

L_ 

CD 

CO 

-Q 

O 

00 

14— 
O 

CD 

m 

CD 
CD 


O 
<* 

CD 


I 


(ojd  jo  spuooas)  °x 


S1 

-H 
t 

a) 

Xi 
O 

-p 
-p 

•H 

aj    in 
o 

d)       iH 
4-> 

rcJ 
G 
-H 

0 
O 

o 
I 


^> 


+ 
a 

CO 

M 

O 
m 

c 


o  O 

-P 
—  (0 
XI     > 

o  tn 


(U 

-P 


d) 

A 

-P 


O 
0 


-P 
O 

rH 

04 


CO 

u 

o 
o< 
a) 


•H 

fa 


127 


Table  5-11. 
The  Initial  Approximate  Solution  a0  for  BD+19°5116 


P()(Yrs) 

T0 

a0 

e0 

.  0 

i0 

0 

0)0 

Qq 

125.6 

1891.1 

5.31 

0.61 

115.2 

94.0 

76.0 
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Table  5-13. 

Residuals  of  the  Observations  for  BD+19°5116 
in  6,  p,  x  and  y 


t 

ve 

VP 

vx 

vy 

1 

1941.750 

-1.3854 

-0.0141 

0.0119 

0.0891 

2 

1942.650 

-0.6496 

0.0041 

-0.0037 

0.0414 

3 

1943.790 

-0.3677 

-0.1215 

0.1226 

0.0173 

4 

1944.720 

0.1869 

-0.0636 

0.0623 

-0.0175 

5 

1945.473 

-0.9591 

0.0433 

-0.0368 

0.0636 

6 

1946.030 

-0.1746 

-0.0046 

0.0060 

0.0103 

7 

1949.750 

-0.3556 

-0.0633 

0.0670 

0.0054 

8 

1949.809 

-0.8641 

0.1506 

-0.1323 

0.0894 

9 

1951.964 

-0.5481 

0.1505 

-0.1311 

0.0812 

10 

1952.190 

-1.7206 

0.0667 

-0.0282 

0.1226 

11 

1954.804 

0.4275 

0.0270 

-0.0359 

-0.0127 

12 

1955.780 

-0.4239 

0.0180 

-0.0040 

0.0320 

13 

1955.971 

0.0417 

0.0608 

-0.0552 

0.0257 

14 

1958.482 

0.5414 

0.0549 

-0.0648 

0.0005 

15 

1959.39 

0.7957 

-0.1293 

0.0772 

-0.1161 

16 

1960.152 

0.6524 

0.0607 

-0.0737 

0.0015 

17 

1961.682 

0.2183 

0.0404 

-0.0405 

0.0140 

18 

1962.070 

-0.0807 

-0.0008 

0.0039 

0.0036 

19 

1962.835 

0.1737 

0.0224 

-0.0244 

0.0058 

20 

1963.757 

-0.5043 

0.0394 

-0.0072 

0.0511 

21 

1964.860 

-0.4534 

0.0348 

-0.0042 

0.0459 

22 

1965.530 

-2.1135 

-0.1028 

0.1743 

0.0324 

23 

1965.766 

-0.1890 

0.0353 

-0.0157 

0.0341 

24 

1968.289 

0.6033 

-0.0009 

-0.0314 

-0.0275 

25 

1969.838 

0.4151 

-0.0756 

0.0227 

-0.0779 

26 

1970.792 

0.1332 

-0.0350 

0.0130 

-0.0339 

27 

1971.939 

-0.0557 

-0.0250 

0.0174 

-0.0183 

28 

1973.903 

0.0669 

-0.0465 

0.0202 

-0.0422 

29 

1975.872 

-0.1861 

-0.0257 

0.0246 

-0.0158 

30 

1977.049 

-0.5783 

-0.0409 

0  0577 

-0.0161 

31 

1978.904 

-0.4527 

-0.0345 

0.0461 

-0.0167 

32 

1980.529 

-0.0453 

-0.1591 

0.0641 

-0.1457 

33 

1981.640 

-1.2982 

0.2654 

-0.0058 

0.2832 

34 

1981.856 

-0.9916 

-0.0883 

0.1055 

-0.0537 

35 

1983.581 

0.2496 

-0.0976 

0.0118 

-0.0990 
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CHAPTER  VI 
DISCUSSION 


For  solving  our  orbit  problem,  the  Newton  scheme  and 
its  modifications  ("FP"  and  "MQ"  algorithms)  have  been 
discussed  in  detail.   As  we  have  seen,  the  Newton  scheme  is 
powerful  and  converges  very  fast  if  the  observational  errors 
are  sufficiently  small  and  the  initial  approximations  to  the 
parameters  are  sufficiently  accurate.   But  in  other  cases, 
Newton's  scheme  fails  to  converge.   Some  modifications  of  it 
were  carried  out  for  these  cases.   Two  other  schemes,  "FP" 
and  "MQ" ,  are  proposed.   These  two  converge  to  a  definitive 
solution  even  when  Newton's  scheme  diverges.   As  we  have 
seen,  "FP"  method  only  shortens,  however,  the  length  of  the 
step  and  retains  all  other  features  of  Newton's  scheme,  so 
that  sometimes  it  converges  still  very  slow.   The  "MQ" 
method  seems  more  powerful  because  it  not  only  shortens  the 
length  of  the  step  but  also  makes  the  corrections  to  the 
parameters  gradually  uncorrelated  in  subsequent  iterations. 
"MQ"  method  converges  much  faster  than  "FP"  does. 

In  addition  to  which  scheme  is  used,  two  things  are 
important  about  the  observation  data  themselves. 
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1)  The  sufficiency  of  observation  data  itself  is  of 
importance.   Whether  a  set  of  observations  is  suitable  for 
computing  an  orbit  depends  on  the  amount,  consistency  and 
distribution  of  the  data.   Some  observation  data  define  only 
a  short  arc  of  small  curvature  variation  like  in  our  example 
for  BD+19°5116.   The  orbit-computation  from  such  data  is 
relatively  more  difficult. 

2)  The  weighting  of  observation  is  also  of  importance. 
The  more  a  computation  is  based  on  more  precise  observations 
the  better  the  chances  are  of  practical  success.   If  using 
weak  data  is  unavoidable,  low  weights  must  be  assigned  to 
them,  in  accordance  with  standard  practice  in  the  field. 
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